# Detect the CPU Core Count From Silverlight

Nov 12, 2009

If you are writing an application that is heavy on multi-threaded computation (e.g. a full-screen blur, a game, or scientific data processing), you will want to know how many threads to run for the best performance.

Edit: just fixed a reported bug where the initial assessment would be 0 msec. Thanks to Morten for reporting it!

The answer is easy: run as many threads as there are CPU cores. For example, run 2 threads on a dual-core machine and 4 threads on a quad-core machine.

This is how to find the number of cores:

1. Create a simple computing function (e.g. one that continuously adds 1 to a number) and run it on 1, 2, 4, 8, and 16 threads.
2. Measure the time it takes for the function to complete for each thread count.

As long as every thread gets a core of its own, the time stays roughly the same. Once you hit the "core limit" of the client system, the time will increase significantly. Here's an example from my box:

If you look at the above data, you can easily tell I'm running on a quad-core system, because jumping from 4 to 8 threads significantly increases the computational time needed (by more than 1.8 times).

This is how to use the source code:

int coreCount = PerformanceMeasure.GetCoreCount();

You can call the above function from the UI thread.

There are also two tweaks in the code that allow it to run at roughly the same speed on all machines, and to finish quickly on single-core machines too:

• Once the core limit is hit, the algorithm stops. E.g. if 4 threads take more than 1.8x as long as 2 threads, you have 2 cores and there is no need to test with 8 threads.
• Before the main algorithm (above) starts, there is an estimation step that calculates how many operations can be executed in 100 msec on 1 thread. This ensures that the assessment runs fast even on slow machines.
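The source of `PerformanceMeasure.GetCoreCount()` is not reproduced in this post, so the following is only a minimal sketch of the steps and tweaks described above, not the actual implementation. All names in it (`CoreCountSketch`, `Spin`, `TimeThreads`, `Calibrate`, `EstimateCoreCount`) are made up for illustration, and it assumes `Environment.TickCount` is an acceptable timer even though its resolution is coarse.

```csharp
using System;
using System.Threading;

public static class CoreCountSketch
{
    // Sink for the loop result so the busy loop has an observable effect.
    private static long sink;

    // The "simple computing function": adds 1 to a number `iterations` times.
    private static void Spin(long iterations)
    {
        long n = 0;
        for (long i = 0; i < iterations; i++)
        {
            n += 1;
        }
        Interlocked.Add(ref sink, n);
    }

    // Runs Spin on `threadCount` threads at once and returns the elapsed
    // milliseconds. The result is never 0, to avoid a 0 msec measurement.
    public static int TimeThreads(int threadCount, long iterations)
    {
        var threads = new Thread[threadCount];
        for (int t = 0; t < threadCount; t++)
        {
            threads[t] = new Thread(() => Spin(iterations));
        }
        int start = Environment.TickCount;
        foreach (var thread in threads) thread.Start();
        foreach (var thread in threads) thread.Join();
        return Math.Max(1, Environment.TickCount - start);
    }

    // Estimation step: grow the workload until 1 thread needs about 100 msec.
    public static long Calibrate()
    {
        long iterations = 1000000;
        while (TimeThreads(1, iterations) < 100 && iterations < long.MaxValue / 2)
        {
            iterations *= 2;
        }
        return iterations;
    }

    // Doubles the thread count until the time jumps by more than 1.8x,
    // then stops early. Only powers of two up to 16 are reported.
    public static int EstimateCoreCount()
    {
        long iterations = Calibrate();
        int previousMs = TimeThreads(1, iterations);
        for (int threads = 2; threads <= 16; threads *= 2)
        {
            int ms = TimeThreads(threads, iterations);
            if (ms > previousMs * 1.8)
            {
                return threads / 2;
            }
            previousMs = ms;
        }
        return 16; // never hit the limit: 16 cores or more
    }
}
```

Because the result depends on timing, other load on the machine can skew it; a sketch like this would probably be run more than once if the answer matters.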

Please comment! I would be interested to know how well the algorithm works and whether it detected your cores as you expected!


# Best Way To Clear WriteableBitmap?

Nov 11, 2009

If you're doing a lot of custom drawing using WriteableBitmap (e.g. a full-screen game), it is extremely important to be able to clear the WriteableBitmap or "screen" quickly.

Let's assume you want to clear the screen to a specific color.

What is the best way to do it?

Here is a short comparison of a few methods for clearing a 512x512 bitmap:

• Clear with for loop: 1000 FPS
• Clear with Array.Copy: 4100 FPS
• Clear with Array.Clear: 11000 FPS

1. Clear with for() loop. This method is the most straightforward, and also the slowest:

public static void ClearForLoop(int[] pixels, int len, int color)
{
    for (int i = 0; i < len; i++)
    {
        pixels[i] = color;
    }
}

2. Clear by using Array.Copy. This method is not only fast, it also lets you "clear" to an image (not just a color), which is great if you have a pre-defined background or something like that.

public static void ClearArrayCopy(int[] pixels, int[] clearTo, int len)
{
    Array.Copy(clearTo, 0, pixels, 0, len);
}

This method assumes that you have already pre-initialized the "clear" bitmap (just do it once! :) with the color/image:

// note: do this ONCE!, NOT on every frame! obvious, but worth mentioning just in case
int[] clearScreen = new int[pixels.Length];

for (int i = 0; i < pixels.Length; i++)
{
    clearScreen[i] = color;
}

3. Clear with Array.Clear: the fastest way, but unfortunately it only lets you clear to 0 (meaning a transparent image).

Array.Clear(pixels, 0, pixels.Length);
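How the FPS figures above were measured is not stated, so the following is only a rough sketch of one way to compare the three methods. `ClearBenchmarkSketch`, `ClearsPerSecond` and `Compare` are hypothetical names, and `Environment.TickCount` is assumed to be a good enough timer. The delegate call adds some overhead of its own, so absolute numbers from this harness should not be expected to match the ones listed above.

```csharp
using System;

public static class ClearBenchmarkSketch
{
    // Calls `clear` repeatedly for roughly `durationMs` milliseconds
    // and returns the number of clears per second.
    public static double ClearsPerSecond(Action clear, int durationMs)
    {
        int frames = 0;
        int start = Environment.TickCount;
        int elapsed;
        do
        {
            clear();
            frames++;
            elapsed = Environment.TickCount - start;
        } while (elapsed < durationMs);
        return frames * 1000.0 / Math.Max(1, elapsed);
    }

    // Returns clears per second for, in order:
    // the for loop, Array.Copy, and Array.Clear.
    public static double[] Compare(int width, int height, int color, int durationMs)
    {
        int[] pixels = new int[width * height];

        // Pre-initialize the "clear" bitmap once, outside the timed code.
        int[] clearTo = new int[pixels.Length];
        for (int i = 0; i < clearTo.Length; i++) clearTo[i] = color;

        double forLoop = ClearsPerSecond(() =>
        {
            for (int i = 0; i < pixels.Length; i++) pixels[i] = color;
        }, durationMs);

        double arrayCopy = ClearsPerSecond(
            () => Array.Copy(clearTo, 0, pixels, 0, pixels.Length), durationMs);

        double arrayClear = ClearsPerSecond(
            () => Array.Clear(pixels, 0, pixels.Length), durationMs);

        return new double[] { forLoop, arrayCopy, arrayClear };
    }
}
```

Calling `ClearBenchmarkSketch.Compare(512, 512, color, 1000)` would time each method for about one second on a 512x512 buffer.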

Depending on the application, you'd choose either Array.Copy(), since it's the most versatile, or Array.Clear(), since it's the fastest when clearing to 0 is enough. You may also choose Array.Copy() over Array.Clear() because it is easy to multithread: each thread copies its own slice of the buffer, so the work can be spread over multiple cores, while a single Array.Clear() call runs on a single thread/core.

Note that all measurements assume a single core is used. If you have a multi-core system, you can improve the speed quite a bit by running the clears multithreaded.
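As a minimal sketch of what "running those multithreaded" could look like for the Array.Copy variant, the buffer can be split into one slice per thread. `ParallelClearSketch` and `ClearArrayCopyParallel` are hypothetical names, and the `threadCount` parameter is assumed to come from the caller (for example, the number of cores).

```csharp
using System;
using System.Threading;

public static class ParallelClearSketch
{
    // Copies the first `len` elements of clearTo into pixels,
    // using up to `threadCount` threads that each copy one slice.
    public static void ClearArrayCopyParallel(int[] pixels, int[] clearTo, int len, int threadCount)
    {
        var threads = new Thread[threadCount];
        int chunk = (len + threadCount - 1) / threadCount; // slice size, rounded up

        for (int t = 0; t < threadCount; t++)
        {
            // Declared inside the loop so each lambda captures its own values.
            int start = t * chunk;
            int count = Math.Min(chunk, len - start);
            if (count <= 0)
            {
                continue; // more threads than elements: nothing left to copy
            }
            threads[t] = new Thread(() => Array.Copy(clearTo, start, pixels, start, count));
            threads[t].Start();
        }

        foreach (var thread in threads)
        {
            if (thread != null) thread.Join();
        }
    }
}
```

Creating new threads on every frame has a cost of its own, so a real game would more likely keep a few worker threads alive and hand them a slice each frame; the sketch only shows how the buffer is divided.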


# Fast DrawLine() in Silverlight

Nov 6, 2009

How fast can you make it go?

int inc = incy1 * w + incx;
for (int i = 0; i < lenY; i++) {
    pixels[index >> PRECISION_SHIFT] = color;
    index += inc;
}

40 FPS * 10,000 lines = 400,000 lines/sec

Note: in my perf tests I did a single-threaded version, so if you have multiple cores (2-4), you might be able to get to more than 0.4 million lines/sec :)

I looked at the excellent posts from Rene about Drawing Shapes in Silverlight, and decided to give the DrawLine() code a whirl :) After trying to optimize it for some time, I ended up with code that runs twice as fast!

There is no sample here, because I expect that Rene will integrate it or try it out in his library (that’s really the best place for the code now, to avoid multiple sample DLLs).

Here is the complete DrawLine() with my optimizations:

public static void DrawLineFast(this WriteableBitmap bmp, int x1, int y1, int x2, int y2, int color)
{
    // Cache the width and the pixel array in locals for faster access (really important, speeds things up a lot!)
    // Note: no clipping is done, so both end points must lie inside the bitmap.
    int w = bmp.PixelWidth;
    int[] pixels = bmp.Pixels;

    // Distance between start and end point
    int dx = x2 - x1;
    int dy = y2 - y1;

    // Fixed point with 8 fractional bits: PRECISION_VALUE represents 1.0
    const int PRECISION_SHIFT = 8;
    const int PRECISION_VALUE = 1 << PRECISION_SHIFT;

    // Determine slope (absolute value)
    int lenX, lenY;
    int incy1;
    if (dy >= 0)
    {
        incy1 = PRECISION_VALUE;
        lenY = dy;
    }
    else
    {
        incy1 = -PRECISION_VALUE;
        lenY = -dy;
    }

    int incx1;
    if (dx >= 0)
    {
        incx1 = 1;
        lenX = dx;
    }
    else
    {
        incx1 = -1;
        lenX = -dx;
    }

    if (lenX > lenY)
    {
        // x increases by +/- 1

        // Init steps and start
        int incy = (dy << PRECISION_SHIFT) / lenX;
        int y = y1 << PRECISION_SHIFT;

        // Walk the line!
        for (int i = 0; i < lenX; i++)
        {
            pixels[(y >> PRECISION_SHIFT) * w + x1] = color;
            x1 += incx1;
            y += incy;
        }
    }
    else
    {
        // y increases by +/- 1, so it has no fractional part and the
        // multiplication by the width (* w) can safely be done before the for() loop

        // Prevent division by zero
        if (lenY == 0)
        {
            return;
        }

        // Init steps and start
        int incx = (dx << PRECISION_SHIFT) / lenY;
        int index = (x1 + y1 * w) << PRECISION_SHIFT;

        // Walk the line!
        int inc = incy1 * w + incx;
        for (int i = 0; i < lenY; i++)
        {
            pixels[index >> PRECISION_SHIFT] = color;
            index += inc;
        }
    }
}

## Summary of Optimizations Done

• Moved from using float to using fixed point
• Took advantage of the fact that if the line is longer in the y direction than in the x direction (a vertical-looking line), y changes by 1 on each iteration. This allows me to remove the multiplication from the innermost line-drawing loop
• Removed extra variables, so that remaining variables can be optimized by the JIT compiler, hopefully in CPU registers
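To make the first two points a bit more concrete, here is a small stand-alone sketch that reuses the same arithmetic as `DrawLineFast()` but returns the computed rows and pixel indices instead of writing pixels. `FixedPointSketch`, `RowsForXMajorLine` and `IndicesForYMajorLine` are hypothetical helper names introduced only for this illustration.

```csharp
public static class FixedPointSketch
{
    const int PRECISION_SHIFT = 8;                     // 8 fractional bits
    const int PRECISION_VALUE = 1 << PRECISION_SHIFT;  // 256 represents 1.0

    // Fixed point instead of float: y is kept as an integer scaled by 256
    // and shifted back down only when a row number is needed.
    // Assumes lenX > 0.
    public static int[] RowsForXMajorLine(int y1, int dy, int lenX)
    {
        int incy = (dy << PRECISION_SHIFT) / lenX;
        int y = y1 << PRECISION_SHIFT;
        var rows = new int[lenX];
        for (int i = 0; i < lenX; i++)
        {
            rows[i] = y >> PRECISION_SHIFT;
            y += incy;
        }
        return rows;
    }

    // Vertical-looking line: y moves by exactly one row per step, so the
    // row offset (+/- w) and the fractional x step are folded into a single
    // increment, leaving no multiplication inside the loop.
    // Assumes dy != 0.
    public static int[] IndicesForYMajorLine(int x1, int y1, int dx, int dy, int w)
    {
        int lenY = dy >= 0 ? dy : -dy;
        int incy1 = dy >= 0 ? PRECISION_VALUE : -PRECISION_VALUE;
        int incx = (dx << PRECISION_SHIFT) / lenY;
        int index = (x1 + y1 * w) << PRECISION_SHIFT;
        int inc = incy1 * w + incx;
        var indices = new int[lenY];
        for (int i = 0; i < lenY; i++)
        {
            indices[i] = index >> PRECISION_SHIFT;
            index += inc;
        }
        return indices;
    }
}
```

For example, `RowsForXMajorLine(0, 5, 10)` yields the rows 0, 0, 1, 1, 2, 2, 3, 3, 4, 4, and `IndicesForYMajorLine(2, 0, 2, 4, 10)` yields the pixel indices 2, 12, 23, 33, which on a bitmap 10 pixels wide are the points (2,0), (2,1), (3,2), (3,3).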

Hope you like it! Please comment! Also, if you can make it faster, please do!
