September 5 2010




 
 
Real-time Behavior of the Microsoft .NET Compact Framework

Introduction

With the arrival of Microsoft Visual Studio® .NET 2003, with its integrated support for Smart Device Applications, it is possible to develop applications for a broad range of devices using managed code. Software developers can now use exciting new languages like Microsoft Visual Basic® .NET and Microsoft Visual C#® for device development. Although this sounds promising, one question remains to be answered: is it possible to make use of the real-time capabilities of Windows CE .NET while using managed code to write applications for an embedded device? In this paper we answer that question and suggest a possible scenario in which real-time behavior can be combined with .NET functionality.

A Managed and an Unmanaged World

Some of the advantages of a managed environment like Microsoft’s Common Language Runtime, such as writing safer and platform-independent software, might turn out to be disadvantages in a real-time environment. Typically, we cannot afford to wait for a just-in-time compiler to compile a method before using it, and we cannot wait for a garbage collector to do its work of reclaiming memory held by objects that are no longer in use. Both features might interfere with deterministic system behavior. It is possible to force a garbage collection by calling GC.Collect(). However, we want the GC to decide for itself when to run, since it is highly optimized. To allow hard real-time behavior, it would be great if there were a way to distinguish between hard real-time functionality, written in native (unmanaged) Microsoft Win32® code, and other functionality, written in managed code. Making use of Platform Invoke, or P/Invoke, we can do just that.

Platform Invoke at Work

The simple definition in MSDN help states that P/Invoke is the functionality provided by the common language runtime to enable managed code to call unmanaged native DLL entry points. In other words, P/Invoke gives us an escape route from managed .NET code to unmanaged Win32 code. To be able to use this mechanism within Windows CE .NET, the native Win32 functions that we want to call must be exported from a dynamic link library. Since the managed .NET environment does not know anything about Microsoft Visual C++® name mangling, the functions to be called from within a managed application should also be declared with C linkage (extern "C"), so that they are exported under their plain C names. To use the functionality inside a DLL, we need to build a wrapper class around the function entry points in our managed application. Listing 1 shows an example of a small, unmanaged DLL and a wrapper class in managed code.

Listing 1a - Wrapper class to be able to P/Invoke into a DLL

Listing 1b - Call an unmanaged function from within managed code

Listing 1a & 1b: Calling into unmanaged code

Listing 1c - Unmanaged code to call into

Listing 1c: Win32 DLL to be called from within managed code

Using the wrapper class, it is possible to call functions that exist inside the DLL. Since this mechanism works for all exported DLL functions and since almost all Win32 APIs are exported in coredll.dll, this mechanism also provides a way to call into almost any Win32 API. P/Invoke is used in our test to have a managed application calling into an unmanaged real-time thread.
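As an illustration of the native side of such a call, here is a minimal sketch of a DLL entry point that a managed wrapper could reach through P/Invoke. It is not the article's Listing 1c: the names RTDLL_API, TimingInfo, StoreTimingInfo and GetTimingInfo are hypothetical and chosen only for this example. The export macro expands to nothing outside Windows so the sketch also compiles with other toolchains.

```cpp
#include <cassert>
#include <cstdint>

using std::int32_t;

// __declspec(dllexport) only exists on Windows toolchains, so the macro
// falls back to plain C linkage elsewhere to keep this sketch portable.
#if defined(_WIN32)
#define RTDLL_API extern "C" __declspec(dllexport)
#else
#define RTDLL_API extern "C"
#endif

// Hypothetical record handed to the caller. It holds only fixed-size
// integers, which keeps the layout simple for a managed wrapper to mirror.
struct TimingInfo {
    int32_t minMicros;
    int32_t maxMicros;
    int32_t avgMicros;
};

// Latest values, private to this translation unit.
static TimingInfo g_latest = {0, 0, 0};

// Hypothetical entry point used by the native side to record new values.
RTDLL_API void StoreTimingInfo(int32_t minMicros, int32_t maxMicros,
                               int32_t avgMicros) {
    g_latest.minMicros = minMicros;
    g_latest.maxMicros = maxMicros;
    g_latest.avgMicros = avgMicros;
}

// extern "C" suppresses C++ name mangling, so the function is exported
// under the plain name "GetTimingInfo". Returns 1 on success, 0 on a
// null pointer.
RTDLL_API int32_t GetTimingInfo(TimingInfo* out) {
    if (out == nullptr) {
        return 0;
    }
    *out = g_latest;
    return 1;
}
```

On the managed side, a wrapper class would declare a matching static extern method marked with the DllImport attribute and a struct with the same three 32-bit fields.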

A Real-time Scenario

Imagine the following scenario: A system needs hard real-time functionality to retrieve information from an external source. The information is stored in the system and will be presented to the user in some graphical way. Figure 1 shows a possible scenario for this problem.

A possible real-time scenario

We see a real-time thread receiving an interrupt from an external source. The thread processes the interrupt and stores relevant information to be presented to the user. On the right-hand side, a separate UI thread, written in managed code, reads information that was previously stored by the real-time thread. Given the fact that context switches between processes are expensive, we want the entire system to live within the same process. If we separate real-time functionality from user interface functionality by putting real-time functionality in a dynamic link library and providing an interface between that DLL and the other parts of the system, we have achieved our goal of having one single process dealing with all parts of the system. Communication between the UI thread and the RT thread is possible by means of P/Invoking into the native Win32 code.

The Actual Test

We want to make the test representative, yet as simple as possible so it can be repeated easily on other systems as well. For that purpose, the source code to run the experiment yourself is available for download. Our test requires a way to feed interrupts into the system and a way to output probes so that the performance of the system can be measured. We feed the system using a block wave, generated by a signal generator. Of course, the Windows CE .NET platform should be capable of hosting the .NET CF. Paul Yao has written an article indicating which Windows CE .NET modules and components should be present to run managed applications (see “Microsoft .NET Compact Framework for Windows CE .NET” on msdn.microsoft.com). The aim is for the test to be both representative and reproducible: to repeat it, you only need to find a suitable interrupt source for input. Listing 2 shows how to hook a physical interrupt to an interrupt service thread (IST).

Listing 2 - Hooking up an interrupt to an IST

Listing 2: Connecting a physical interrupt to an interrupt service thread

To test the real-time behavior of an application making use of managed code and the .NET Compact Framework, we created a Windows CE platform based on Standard SDK. We also included the RTM version of the .NET Compact Framework in the platform. The operating system runs on a Geode GX1 at 300 MHz. We feed the system with a block wave, connected directly to the IRQ5 line on the PC104 bus (pin 23). Figure 2 shows the system used for the experiment. The frequency of the block wave is 10 kHz. On each rising edge, an interrupt is generated. The interrupt is processed by an interrupt service thread (IST). In the IST we send out probe pulses to the parallel port to be able to view an output signal. We also store the time at which the IST was activated, making use of the high-resolution QueryPerformanceCounter API. To be able to measure timing information over a long period of time, we also store the maximum and minimum time as well as the average time. The time from interrupt occurrence to probe output is an indication of IRQ – IST latency. The timing information acquired by the high-resolution timer indicates when the IST is activated. Ideally, the time between two activations should be 100 µs for an interrupt rate of 10 kHz (1 / 10 kHz = 100 µs). All timing information is passed to the graphical user interface at regular intervals.
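The bookkeeping described here (minimum, maximum and average time between IST activations) can be sketched as follows. This is an assumption-laden illustration, not the code from the article's DLL: the class IntervalStats and the function expectedPeriodMicros are hypothetical names, and plain 64-bit tick values stand in for what QueryPerformanceCounter and QueryPerformanceFrequency would return on the device.

```cpp
#include <cassert>
#include <cstdint>
#include <limits>

using std::int64_t;

// Expected period in microseconds for a given interrupt rate in Hz.
constexpr int64_t expectedPeriodMicros(int64_t rateHz) {
    return 1000000 / rateHz;
}

// Hypothetical helper: tracks the minimum, maximum and average time
// between successive IST activations.
class IntervalStats {
public:
    // ticksPerSecond is the counter frequency (ticks per second).
    explicit IntervalStats(int64_t ticksPerSecond)
        : ticksPerSecond_(ticksPerSecond) {
        assert(ticksPerSecond > 0);
    }

    // Call once per IST activation with the current counter value.
    void onActivation(int64_t tick) {
        if (hasLast_) {
            // Convert the tick difference to microseconds.
            const int64_t micros =
                (tick - lastTick_) * 1000000 / ticksPerSecond_;
            if (micros < minMicros_) minMicros_ = micros;
            if (micros > maxMicros_) maxMicros_ = micros;
            sumMicros_ += micros;
            ++count_;
        }
        lastTick_ = tick;
        hasLast_ = true;
    }

    // All three return 0 until at least one interval has been measured.
    int64_t minMicros() const { return count_ ? minMicros_ : 0; }
    int64_t maxMicros() const { return maxMicros_; }
    int64_t averageMicros() const { return count_ ? sumMicros_ / count_ : 0; }

private:
    int64_t ticksPerSecond_;
    int64_t lastTick_ = 0;
    bool hasLast_ = false;
    int64_t minMicros_ = std::numeric_limits<int64_t>::max();
    int64_t maxMicros_ = 0;
    int64_t sumMicros_ = 0;
    int64_t count_ = 0;
};
```

Keeping only running totals means the real-time thread never allocates memory while measuring; the interface side only needs to read three numbers per refresh.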

The actual test system

As the .NET CF itself cannot be used in hard real-time situations, as explained earlier, we decided to use it for presentation purposes only and to use a DLL, written in eVC++ 4.0, for all real-time functionality. For communication between the DLL and the .NET CF GUI, a double buffering mechanism is used in combination with P/Invoke. The GUI requests new timing information at regular intervals, making use of a System.Threading.Timer object. The DLL decides when it has time available to pass information to the GUI. Until data is ready, the GUI is blocked. The refresh rate of the information presented in the GUI is user selectable. For our test we used a refresh rate of 50 ms.
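To make the double buffering idea concrete, here is a minimal sketch in portable C++. It is an illustration under stated assumptions rather than the mechanism used in the article's DLL: the names TimingSnapshot and DoubleBuffer are hypothetical, std::mutex stands in for whatever Win32 synchronization the DLL uses, and the reader returns false when no new data is available instead of blocking as the GUI does in the article.

```cpp
#include <array>
#include <cassert>
#include <mutex>

// Hypothetical snapshot of the timing values shown in the GUI.
struct TimingSnapshot {
    long minMicros;
    long maxMicros;
    long avgMicros;
};

class DoubleBuffer {
public:
    // Real-time side: fill the back buffer. No lock is needed because
    // the reader never touches the back buffer.
    TimingSnapshot& back() { return buffers_[1 - front_]; }

    // Real-time side: make the back buffer visible to the reader.
    // Uses try_to_lock so the real-time thread never waits; if the
    // reader is busy copying, publishing is skipped for this cycle.
    bool tryPublish() {
        std::unique_lock<std::mutex> lock(mutex_, std::try_to_lock);
        if (!lock.owns_lock()) {
            return false;
        }
        front_ = 1 - front_;
        fresh_ = true;
        return true;
    }

    // GUI side: copy out the latest published snapshot.
    // Returns false if nothing new was published since the last read.
    bool read(TimingSnapshot& out) {
        std::lock_guard<std::mutex> lock(mutex_);
        if (!fresh_) {
            return false;
        }
        out = buffers_[front_];
        fresh_ = false;
        return true;
    }

private:
    std::array<TimingSnapshot, 2> buffers_{};
    int front_ = 0;      // written only by the real-time side, under the lock
    bool fresh_ = false; // true when unread data is waiting
    std::mutex mutex_;
};
```

The design choice worth noting is the asymmetry: the lower-priority reader may wait, but the real-time writer only ever attempts the swap, which matches the article's statement that the DLL decides when it has time to hand over data.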

The following pseudocode explains the operation of the IST and the mechanism by which the GUI retrieves information, stored in the native Win32 DLL.

Interrupt Service Thread:

Listing 3 - Interrupt Service Thread 

Managed code periodic update of display data:

Listing 4 - Managed periodic display update

During the test we hooked up an oscilloscope and made printouts of both the scope and the Windows CE graphical display 10 minutes into the experiment. Figure 3 shows the interrupt latency, measured with the oscilloscope. Best case, the latency is 14.0 µs; worst case, it is 54.4 µs, meaning a jitter of 40.4 µs. Figure 4 shows the period between successive IST activations. This figure is a screen shot of the actual user interface. Ideally the IST should run every 100 µs, which is also the average time during our measurement (the blue line in the middle). We also measured overall minimum (green) and maximum (red) times, as well as minimum and maximum times over the sample period of 50 ms (the white block). The deviation we found during the test period is limited to ± 40 µs.

 

Managed application: IRQ - IST latency

 

Managed application: IST activation times after running 10 minutes

The Results

We measured over a longer period of time to make sure that both the garbage collector and the JIT compiler were frequently active. Thanks to the folks at Microsoft, we were able to monitor the behavior of the .NET CF because they provided us with a performance counters registry key. Using this key, a number of performance counters within the .NET CF are activated. We mainly used this performance information to verify that the JITter and the garbage collector actually ran. It also gave a nice indication of the number of objects used during the course of the test.

Listing 5 - Handling timer messages in a managed world


As you can see in listing 5, we instantiate a number of objects each time we periodically update the screen. Two pens and a graphics object are created during each screen update. The functions td.ShowValue and td.SetTimerPointer also create brushes. Since td.SetTimerPointer is called twice per screen update, a total of 6 objects are created during each update of the screen. Since we update the screen every 50 ms, there are 20 updates per second, so 120 objects are created per second. Over 10 minutes of execution, 72000 objects are created. All these objects are potentially subject to garbage collection. In table 1, the number of allocated objects roughly corresponds to this theoretical value.

We have included performance counter results for both a 10-minute and a 100-minute run. This data was recorded during our actual test. As you can see, after running 10 minutes, garbage collection occurred without a noticeable drop in performance. Table 2 shows the performance counters for a run of approximately 100 minutes. In this run, full garbage collection occurred. During this run, only 461499 objects were created instead of the 720000 expected objects. This is approximately 36% fewer than expected. The difference is likely caused by the performance counters which, according to Microsoft, result in a performance penalty of about 30% within the managed application. However, the real-time behavior of the system was not influenced, as you can see in the following figure.
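The expected object counts and the shortfall can be checked with a few lines of arithmetic. The sketch below uses only figures quoted in the text (6 objects per update, a 50 ms refresh interval, and the measured counts from tables 1 and 2); the function names are hypothetical.

```cpp
#include <cassert>

// Objects created per screen update, as counted in the text.
constexpr long kObjectsPerUpdate = 6;

// Number of screen updates per second for a refresh interval in ms.
constexpr long updatesPerSecond(long refreshMillis) {
    return 1000 / refreshMillis;
}

// Theoretical number of objects created over a run of the given length.
constexpr long expectedObjects(long minutes, long refreshMillis) {
    return kObjectsPerUpdate * updatesPerSecond(refreshMillis) * minutes * 60;
}

// Percentage by which the measured count falls short of the expectation.
inline double shortfallPercent(long expected, long measured) {
    return 100.0 * static_cast<double>(expected - measured) /
           static_cast<double>(expected);
}

// The two theoretical values quoted in the text.
static_assert(expectedObjects(10, 50) == 72000, "10-minute run");
static_assert(expectedObjects(100, 50) == 720000, "100-minute run");
```

With the measured values, shortfallPercent(72000, 66898) comes to roughly 7.1% for the 10-minute run and shortfallPercent(720000, 461499) to roughly 35.9% for the 100-minute run.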

Managed application: IST activation times after running 100 minutes

Extra proof for the fact that the garbage collector and the JITter did not influence real-time behavior can be found in the remote process viewer. In the next figure you can see a screen dump of the remote process viewer for the managed application. All threads in the application (except the real-time thread with priority 0) run at normal priorities (251). During our measurements we did not find that the JITter or garbage collector needed kernel blocking to perform their tasks.

Remote process viewer showing the managed application

Table 1: .NET CF performance results after running the test for ten minutes

Counter                                           Value       n    mean     min     max
Execution Engine Startup Time                       492       0       0       0       0
Total Program Run Time                           603752       0       0       0       0
Peak Bytes Allocated                            1115238       0       0       0       0
Number Of Objects Allocated                       66898       0       0       0       0
Bytes Allocated                                 1418216   66898      21       8   24020
Number Of Simple Collections                          0       0       0       0       0
Bytes Collected By Simple Collection                  0       0       0       0       0
Bytes In Use After Simple Collection                  0       0       0       0       0
Time In Simple Collect                                0       0       0       0       0
Number Of Compact Collections                         1       0       0       0       0
Bytes Collected By Compact Collections           652420       1  652420  652420  652420
Bytes In Use After Compact Collection            134020       1  134020  134020  134020
Time In Compact Collect                             357       1     357     357     357
Number Of Full Collections                            0       0       0       0       0
Bytes Collected By Full Collection                    0       0       0       0       0
Bytes In Use After Full Collection                    0       0       0       0       0
Time In Full Collection                               0       0       0       0       0
GC Number Of Application Induced Collections          0       0       0       0       0
GC Latency Time                                     357       1     357     357     357
Bytes Jitted                                      14046     259      54       1     929
Native Bytes Jitted                               70636     259     272      35    3758
Number of Methods Jitted                            259       0       0       0       0
Bytes Pitched                                         0       0       0       0       0
Number of Methods Pitched                             0       0       0       0       0
Number of Exceptions                                  0       0       0       0       0
Number of Calls                                 3058607       0       0       0       0
Number of Virtual Calls                            1409       0       0       0       0
Number Of Virtual Call Cache Hits                  1376       0       0       0       0
Number of PInvoke Calls                          176790       0       0       0       0
Total Bytes In Use After Collection              421462       1  421462  421462  421462

Table 2: .NET CF performance results after running the test for one hundred minutes

Counter                                           Value       n    mean     min     max
Execution Engine Startup Time                       478       0       0       0       0
Total Program Run Time                          5844946       0       0       0       0
Peak Bytes Allocated                            1279678       0       0       0       0
Number Of Objects Allocated                      461499       0       0       0       0
Bytes Allocated                                 8975584  461499      19       8   24020
Number Of Simple Collections                          0       0       0       0       0
Bytes Collected By Simple Collection                  0       0       0       0       0
Bytes In Use After Simple Collection                  0       0       0       0       0
Time In Simple Collect                                0       0       0       0       0
Number Of Compact Collections                        11       0       0       0       0
Bytes Collected By Compact Collections          8514912      11  774082  656456  786476
Bytes In Use After Compact Collection           1679656      11  152696  147320  153256
Time In Compact Collect                            5395      11     490     436     542
Number Of Full Collections                            2       0       0       0       0
Bytes Collected By Full Collection               397428       2  198714    1916  395512
Bytes In Use After Full Collection                79924       2   39962   17328   62596
Time In Full Collection                              65       2      32       2      63
GC Number Of Application Induced Collections          0       0       0       0       0
GC Latency Time                                    5460      13     420       2     542
Bytes Jitted                                      19143     356      53       1     929
Native Bytes Jitted                               95684     356     268      35    3758
Number of Methods Jitted                            356       0       0       0       0
Bytes Pitched                                     85304     326     261      35    3758
Number of Methods Pitched                           385       0       0       0       0
Number of Exceptions                                  0       0       0       0       0
Number of Calls                                21778124       0       0       0       0
Number of Virtual Calls                            1067       0       0       0       0
Number Of Virtual Call Cache Hits                  1029       0       0       0       0
Number of PInvoke Calls                         1996991       0       0       0       0
Total Bytes In Use After Collection             5632119      13  433239   84637  493054

Pitfalls

During the test, increasing the frequency of the block wave led to unexpected results in the managed application. Especially in the situation where the screen needed frequent repaints (because areas of the screen were invalid), the application randomly hung the system. Investigation of this problem revealed behavior that experienced Win32 programmers would not expect. In a Win32 application, using a timer results in a WM_TIMER message each time the timer expires. However, WM_TIMER messages are low-priority messages in the message queue, only posted when there are no other higher-priority messages to be processed. This behavior can lead to missed timer ticks, but since SetTimer does not give us an accurate timer to begin with, this is no problem, especially if the timer is used to update a graphical user interface. In the managed application, however, we use a System.Threading.Timer object to create a timer. This timer calls a delegate every time the timer expires. The delegate is called from a separate thread that lives in a thread pool. If the system is too busy with other activities, like repainting an entire screen, more timer delegates, each in a separate thread, are activated before previously activated delegates have finished. This might consume all available threads in the thread pool, causing the system to hang. The solution to prevent this behavior is found in listing 5. Each time a timer delegate is activated, we stop the timer object by invoking the Change method of the Timer object, to indicate that we do not want the next timer message until we have processed the current one.
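The idea behind this fix, never letting a second timer callback start while the previous one is still running, can be sketched in portable C++. Note that this is an analogous technique and not the article's code: instead of stopping and restarting a System.Threading.Timer with Change, the hypothetical class OneShotTimerGate below uses an atomic flag to reject any tick that arrives while work is in progress.

```cpp
#include <atomic>
#include <cassert>
#include <functional>
#include <utility>

// Hypothetical guard around a periodic timer callback.
class OneShotTimerGate {
public:
    explicit OneShotTimerGate(std::function<void()> work)
        : work_(std::move(work)) {}

    // Called by the timer source on every expiry.
    // Returns true if the work ran, false if the tick was dropped.
    bool onTick() {
        // Disarm: exchange returns the previous state, so only one
        // caller at a time can win and proceed to do the work.
        if (!armed_.exchange(false)) {
            ++skipped_;
            return false;
        }
        work_();
        // Re-arm only after the work has completed. In this sketch an
        // exception thrown by work_ would leave the gate disarmed.
        armed_.store(true);
        return true;
    }

    // Number of ticks dropped because work was still in progress.
    int skipped() const { return skipped_.load(); }

private:
    std::function<void()> work_;
    std::atomic<bool> armed_{true};
    std::atomic<int> skipped_{0};
};
```

Dropping a tick is acceptable here for the same reason the text gives for WM_TIMER: the timer only drives a display refresh, so a missed update costs nothing, whereas piling up callbacks can exhaust the thread pool.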

Proof of Results

To be able to compare the results of our experiment with typical results in the same setting, we also wrote a Win32 application that invoked the same DLL with real-time functionality. The Win32 application is functionally identical to the managed application. It provides the system with a graphical user interface in which timing information is displayed in a window. This application paints timing results upon reception of WM_TIMER messages, solely making use of Win32 APIs. The update rate of the screen for both applications is user selectable, but for both applications we chose an update rate of 50 milliseconds. Basically we did not find any difference in performance, as figures 6 and 7 show. In figure 6 the interrupt latency is again measured with an oscilloscope. For the Win32 application, the best-case latency is 14.4 µs and the worst-case latency is 55.2 µs, meaning a jitter of 40.8 µs. These results are virtually identical to those of the test run with the .NET CF managed application (14.0 µs, 54.4 µs and 40.4 µs). Figure 7 shows the period between successive IST activations, again for the Win32 application. Again, the results match those of the .NET CF managed application. The source for the Win32 application is also downloadable, so you can compare the behavior of the two applications yourself.

Win32 application: IRQ - IST latency

Win32 application: IST activation times after running 10 minutes

Conclusion

First, we want to make absolutely sure you understand that we are not suggesting the .NET CF for any real-time work by itself. We suggest that it can be used to advantage as a presentation layer. In such a system, the .NET CF can "peacefully co-exist" with real-time functionality, not affecting the real-time behavior of Windows CE .NET. In this article we have not benchmarked the graphics capabilities of the .NET CF. In our situation we did not find any significant difference between an application written entirely in Win32 and an application partly written in a managed environment with Visual C#. Given the higher programmer productivity and the richness of the .NET Compact Framework, there are many advantages in writing presentation layers in managed code and writing hard real-time functionality in unmanaged code. The clear distinction between these different types of functionality is something you get for free with this approach.

Acknowledgments

We had been thinking for quite a while about testing the usability of the .NET Compact Framework in real-time scenarios. This test was only possible by cooperating with people and companies that could provide us with the proper hardware and measuring equipment. Therefore we would like to thank Willem Haring of Getronics for his support, ideas and hospitality during this project. We would also like to thank the folks at Delem for their hospitality and for providing us with the necessary equipment to execute our tests.

About the Authors

Michel Verhagen works at PTS Software bv in the Netherlands. Michel is a Windows CE consultant with 4 years of experience with Windows CE. His main expertise lies in the area of Platform Builder.

Maarten Struys also works at PTS Software bv, where he is responsible for the real-time and embedded competence center. Maarten is an experienced Windows (CE) developer, having worked with Windows CE since its introduction. Since 2000, Maarten has been working with managed code in .NET environments. He is also a freelance journalist for the two leading magazines on embedded systems development in the Netherlands. He recently opened a website with information about .NET in the embedded world.

Call to Action and Resources

http://www.pts.nl/

http://www.getronics.nl/

http://www.delem.nl/

www.microsoft.com/embedded

To download the source files used in this article, visit Microsoft's MSDN website.

Acronyms and Terms

·  DLL        Dynamic Link Library
·  GC         Garbage Collector
·  GUI        Graphical User Interface
·  IST        Interrupt Service Thread
·  P/Invoke   Platform Invoke
·  RT         Real-time
·  RTM        Release to Manufacturing
·  SDK        Software Development Kit
·  UI         User Interface

 

 