Wednesday, 6 July 2011

Has Microsoft initiated its own Nokia-ization?

Some MS officials, by recently saying that Windows 8 apps will be created in HTML5 and JavaScript, and by deciding not to mention anything about .NET and Silverlight, have fostered speculation and introduced some confusion and chaos (see here, here, but mostly here).

Here are my thoughts about this Jupiter/Windows 8 buzz/fiasco:

  1. XAML is not dead: even if the future of WPF seems highly compromised by the advent of Jupiter and Windows 8, I think the pattern of its declarative language is not dead. The Windows team and their IE peers certainly want to kill Silverlight or WPF for different reasons, but Jupiter seems more like a "Next Generation" XAML-based framework than a UI platform uniquely dedicated to, and optimized for, HTML5/JavaScript. In that context, I also found it very interesting to notice that the XAML team recently joined the Windows one.
  2. .NET is not dead either: while it now seems realistic that the Windows 8 team will provide an alternative to WPF and Silverlight (with the help of their IE classmates, and surfing on the HTML5 wave), I don't buy the scenario where the Windows team would be confident enough to initiate the dismantling of the .NET platform (I wish them good luck if they want to offer an alternative to the entire .NET ecosystem built since 2002 ;-)
  3. But it's really time to adopt an SOA model for our desktop applications: the volatility of MS UI technologies is so high that we definitely have to avoid putting all our eggs in one (fragile) basket. Here is my advice for desktop applications: wherever possible, avoid embedding business layers within them and limit them to consuming services instead (whether web/WCF, or through a MOM). The only responsibility of our rich desktop applications should be to present information in a nice, efficient and ergonomic way (which is anything but simple). Note that I'm not talking about web apps here (maybe in another post?).
  4. It now seems wise to wait for the Build conference in September in order to get concrete and reliable information on this subject (and not rumors or blog noise).
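To make point 3 concrete, here is a minimal sketch (with hypothetical names: `QuoteService`, `QuotePanel`, etc. are illustrations, not a real API): the desktop client depends only on a service contract, so the business layer can live behind a web/WCF service or a MOM, and a UI technology change never touches the business code.

```java
// Hypothetical service contract: the desktop app only knows this interface.
// The real implementation may sit behind a web service, WCF endpoint, or MOM.
interface QuoteService {
    double lastPrice(String instrument);
}

// A stub standing in for the remote implementation (illustration only).
class RemoteQuoteService implements QuoteService {
    public double lastPrice(String instrument) {
        // In a real client this would be a remote service call, not a constant.
        return "EUR/USD".equals(instrument) ? 1.44 : 0.0;
    }
}

// The "rich client": presentation only, no embedded business layer.
class QuotePanel {
    private final QuoteService service;
    QuotePanel(QuoteService service) { this.service = service; }
    String render(String instrument) {
        return instrument + " = " + service.lastPrice(instrument);
    }
}
```

If tomorrow the UI platform changes (WPF, Silverlight, Jupiter, whatever), only `QuotePanel` has to be rewritten; the business logic behind `QuoteService` is untouched.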
Anyway, the future doesn't look bright for MS, with its internal wars and clans (Windows team vs. Dev team) and the lack of clarity in its strategy (Silverlight? WPF?).

Microsoft is sorely lacking a leader able to provide a vision and to bring every team into line. Moreover, they probably need a designer at the head of their organization: someone who will try to change people's lives - whether for good or bad reasons - like Steve Jobs at Apple.

Otherwise, it could be the beginning of a (long) Nokia-ization for the Redmond giant... What a pity!

Saturday, 2 July 2011

About Rx performance

James Miles recently shared some performance figures and explanations related to the latest Rx Performance Improvements.

With this latest v1.0 stable release, Rx seems to be entering a new era. That's very good news for .NET developers...

Friday, 1 July 2011

Continuous delivery at Facebook

Chuck Rossi explains in a video how release management is handled at Facebook; it allows them to push daily updates to their site without production outages or service interruptions. The video is very informative, and the announced KPIs are very impressive...

Watch the video here.

While we are talking about release management, I also highly recommend digging into the "continuous delivery" book and blogs of Jez Humble and Dave Farley.

Release management is definitely not only about processes and tools; it is about changing our culture.

Tuesday, 28 June 2011

Handle your technical debt with SQALE!

You should try SQALE: a generic, language- and tool-independent method for assessing the quality of source code.

In a nutshell, SQALE aims to manage the technical debt within your projects; its classification allows you to analyze the impact of the debt and to prioritize code refactoring/remediation activities. SQALE is also pragmatic and result-oriented: it is a requirement model, not a set of best practices to implement. Based on this requirement model, SQALE also allows you to produce ratings for all your projects (from A to E).

Even if SQALE is tool-independent, the method is designed for automated implementation, and you can already find some concrete solutions (see the SQALE plugin for Sonar, for instance).
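To give a feel for the A-to-E rating idea, here is a tiny sketch. The principle is to compare the estimated remediation cost of the code base against its estimated development cost, and to map that technical-debt ratio onto a grade. The thresholds below are purely illustrative assumptions for the example, not the official SQALE values (real implementations let you configure them).

```java
// Illustrative SQALE-style rating: technical-debt ratio -> A..E grade.
// The thresholds are an assumption for this sketch, not official values.
class SqaleRating {
    static char rate(double remediationCostDays, double developmentCostDays) {
        double ratio = remediationCostDays / developmentCostDays;
        if (ratio <= 0.05) return 'A';   // debt under 5% of dev cost
        if (ratio <= 0.10) return 'B';
        if (ratio <= 0.20) return 'C';
        if (ratio <= 0.50) return 'D';
        return 'E';                      // more than half the dev cost
    }
}
```

A project estimated at 100 man-days of development with 30 man-days of remediation work would get a 'D' under these illustrative thresholds: a clear signal to prioritize refactoring.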

You should definitely have a look at SQALE.

Thursday, 10 March 2011

'REWORK': a must read by the founders of 37signals

'Meetings are toxic', 'Don't be a hero', 'Fire the workaholics', 'Interruption is the enemy of productivity', 'Build an audience', 'Good enough is fine', 'Go to sleep', 'Pick a fight', 'Planning is guessing' ... are some of the topics explained in the 37signals founders' work manifesto.

This irreverent, but very concrete and clever book is definitely a must-read.
It tastes like 'The Pragmatic Programmer' manifesto, but applied at the company level.

Monday, 14 February 2011

Write performant code: keep some fundamental figures in mind

It's funny to see the amount of time spent (I would say lost) by some developers optimizing their code in places where the ROI will be peanuts at the end of the day ;-(

Worse: premature optimizations. The real ones, I mean (I heard you Joe, and I agree ;-). The ones where people sacrifice readability and evolvability on the altar of performance (really? did you measure it concretely?). In most cases, a simple but wise choice of relevant types and data structures within our code saves us lots of time and energy without creating maintainability nightmares.

If you want to develop low-latency and scalable solutions, you should obviously know the core mechanisms of your platform (.NET, Java, C++, Windows, Linux, but also processors, RAM, network stacks and adapters...): how the GC works, how memory and processor cache lines are synchronized, etc.

But do you keep in mind the cost of some elementary operations (in terms of time and CPU cycles) when you are coding? How long does a classic network hop take? A typical I/O read? A memory access, depending on its current state?

As a reminder, here are some figures (some borrowed from Joe Duffy's blog) that you should definitely post in front of your development desk. If you don't want to improve your code blindly, it's important to know what things cost.

  • a register read/write (nanoseconds, single-digit cycles)
  • a cache hit (nanoseconds, tens of cycles)
  • a cache miss to main memory (nanoseconds, hundreds of cycles)
  • a disk access including page faults (micro- or milliseconds, millions of cycles)
  • a local network hop with kernel-bypassing RDMA & 10GigEth (sub 10 microseconds)
  • a LAN network hop (100-500 microseconds)
  • a WAN network roundtrip (milliseconds or seconds, many millions of cycles)
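You can feel some of these figures yourself with a few lines of code. The sketch below (a rough illustration, not a rigorous benchmark: no warm-up control, no accounting for the JIT or timer resolution) walks the same large array sequentially and then in a cache-hostile random order; on arrays much bigger than the CPU caches, the random walk is typically several times slower because almost every access is a cache miss to main memory.

```java
import java.util.Random;

// Rough illustration of cache-miss cost: sequential vs random traversal
// of the same array. The sums are identical; only the access pattern differs.
class CacheCost {
    static long walk(int[] data, int[] order) {
        long sum = 0;
        for (int i : order) sum += data[i];
        return sum;
    }

    public static void main(String[] args) {
        int n = 1 << 24;                    // 16M ints, ~64 MB (>> L3 cache)
        int[] data = new int[n];
        int[] seq = new int[n];
        int[] rnd = new int[n];
        Random r = new Random(42);
        for (int i = 0; i < n; i++) { data[i] = 1; seq[i] = i; rnd[i] = i; }
        for (int i = n - 1; i > 0; i--) {   // Fisher-Yates shuffle of indices
            int j = r.nextInt(i + 1);
            int t = rnd[i]; rnd[i] = rnd[j]; rnd[j] = t;
        }
        long t0 = System.nanoTime(); long s1 = walk(data, seq);
        long t1 = System.nanoTime(); long s2 = walk(data, rnd);
        long t2 = System.nanoTime();
        System.out.printf("sequential: %d ms, random: %d ms (sums %d/%d)%n",
                (t1 - t0) / 1_000_000, (t2 - t1) / 1_000_000, s1, s2);
    }
}
```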

Wednesday, 9 February 2011

The ultimate MOM?


I've never had faith in silver bullets ;-) but I have to admit that the Solace (appliance-based) solution is very attractive as a pre-trade (but also post-trade) financial enterprise messaging system.

Pros
Because Solace uses silicon instead of software for its "hub and spoke" messaging solution (fully compliant with the JMS standard, but with many more features), there are no OS interrupts, context switches, or data copies between kernel and user space for the "hub" part.

I still haven't had the chance to evaluate it, but on paper, and according to some ex-colleagues of mine who have run evaluations, Solace looks like a kind of ultimate solution for financial enterprise messaging systems.

"Reliable delivery with average latency of 22 microseconds at 1M msgs/sec", "Guaranteed delivery with average latency of 98 microseconds at 150,000 msgs/sec", "10 million topics with support for multi-level, wildcarded topics", "9,000 client connections"... (http://www.solacesystems.com/docs/solace_corporate-intro.pdf). Such figures make me dream, and remind us that software can't beat hardware for these message-oriented middleware (MOM) use cases...

Their various white papers are very relevant and informative. In particular, the one that explains how to build a single-dealer platform (meaning: a web-oriented application produced by an investment bank to allow all its clients to deal directly with it). Oh yes, because the future version of their appliance will also be able to bridge to HTTP clients (one more killer feature ;-)

This particular white paper is available from here: http://www.solacesystems.com/solutions/financial-services/single-dealer-platform

Cons
Beyond the (high) price of such a solution, and even if it is increasingly used in some banks, the main doubts to be lifted concern the sustainability of such hardware-based solutions.

Indeed, what would happen if their (unique?) appliance production factory burnt down? How long would it take Solace to restore mass production and fulfill its contracts?

Tuesday, 4 January 2011

Low latency systems in .NET

Not new, but since I'm currently in a low-latency mood ;-)

Back in June 2009, Rapid Addition, one of the leading suppliers of front-office messaging components to the global financial services industry, published a white paper with Microsoft on how to build an ultra-low latency FIX engine with the .NET 3.5 framework...

Their white paper (and the RA Generation Zero framework) mostly explains how to prevent gen2 garbage collections.

I really like their description of a managed low-latency system: a startup phase (with preallocation of pooled memory resources), a forced full GC phase, then a continuous operation phase (where garbage collections are avoided by using and recycling resource pools).
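That pooling idea can be sketched in a few lines (hypothetical names; a real FIX engine is far more involved, and in Java the concern is old-generation collections rather than gen2): preallocate all message objects at startup, then acquire and release them during continuous operation, so the steady state allocates nothing and gives the collector nothing to promote.

```java
import java.util.ArrayDeque;

// Startup phase: preallocate a fixed pool of reusable message objects.
// Continuous phase: acquire/release instead of new/discard, so the
// steady state produces no garbage for the collector to promote.
class MessagePool {
    static class Message {
        final byte[] payload = new byte[256]; // fixed-size, reused buffer
        int length;
    }

    private final ArrayDeque<Message> free = new ArrayDeque<>();

    MessagePool(int size) {
        for (int i = 0; i < size; i++) free.push(new Message());
    }

    Message acquire() {
        Message m = free.poll();
        // Exhaustion is a sizing bug in this model: fail fast, never allocate.
        if (m == null) throw new IllegalStateException("pool exhausted");
        return m;
    }

    void release(Message m) {
        m.length = 0;   // reset state before recycling
        free.push(m);
    }

    int available() { return free.size(); }
}
```

After constructing the pool (and, following the white paper's recipe, forcing a full GC once), the hot path only ever touches preallocated objects.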

In my opinion, this white paper demonstrates that Rapid Addition really knows what they are talking about.

In another press release, they said that their FIX engine breaks the 10-microsecond average latency barrier at a throughput of 12,000 messages/second (still with the .NET 3.5-based solution).

Very informative.

Sunday, 2 January 2011

A new perspective on ultra-low latency, high-performance systems

I just watched an awesome presentation by Martin Thompson and Michael Barker. They explain how they implemented their ultra-low latency (and high-throughput) systems for the London Multi-Asset eXchange (LMAX), and it's pretty impressive: 100,000 transactions per second at less than 1 millisecond latency, in Java...

Since I've been working on FX and low-latency systems in general for several years now, I was very interested by their teaser (100K...). I have to admit that I was thrilled by their presentation.

For those who don't have one hour to kill watching the video, here is a summary:

---HOW TO DO 100K TPS AT LESS THAN 1ms LATENCY----------------------------
  1. UNDERSTAND YOUR PLATFORM
  2. CHECK YOUR PERFORMANCE FROM THE BEGINNING
  3. FOLLOW THE TIPS
---------------------------------------------------------------------------------------------------------------


UNDERSTAND YOUR PLATFORM
  • You have to know how modern hardware works in order to build ultra-low latency systems
  • The advent of multi-core processors with their bigger and smarter caches (do you really know how processor cache synchronization works? the drawbacks of false sharing? etc.)
  • OK, the free lunch is over (for GHz), but it's time to order and use more memory!!! (144 GB servers with 64-bit addressing, for instance)
  • Disk is the new tape! (fast for sequential access); use SSDs for random, threaded access instead
  • Network is not slow anymore: 10GigE is now a commodity, and you can get sub-10-microsecond local hops with kernel-bypassing RDMA
  • (Not hardware, but) understand how the GC and the JIT work (under the hood)


CHECK YOUR PERFORMANCE FROM THE BEGINNING
  • Write performance tests first
  • Run them automatically and nightly, to detect when you should start optimizing your code
  • There is still no need for early, premature performance optimizations


FOLLOW THE TIPS
  • Keep the working set in memory (data and behaviour co-located)
  • Write cache-friendly (cache-line aware) code (the rebirth of arrays ;-)
  • Choose your data structures wisely
  • Queues are awful for concurrent access; use ring buffers instead (no stress: we just said we bought lots of memory ;-)
  • Use custom, cache-friendly collections
  • Write simple, clean and compact code (the JIT always does better with simpler code; shorter methods are easy to inline)
  • Invest in modeling your domain. Also respect the single responsibility principle (one class, one thing; one method, one thing, ...) and the separation of concerns
  • Take the right approach to concurrency. Concurrent programming is about two things, mutual exclusion and visibility of changes, which can be implemented following two main approaches: i) a difficult locking approach (with context switches to the kernel), and ii) a VERY difficult lock-free approach based on atomic, non-blocking (user-space) instructions (remember how optimistic locking mechanisms are implemented within databases). You should definitely choose the second one
  • Keep the GC under control. Because the GC may pause your application, you should avoid triggering it, by preallocating (circular buffers) and by using a huge amount of (64-bit) memory
  • Run business logic on a single thread and push concurrency into the infrastructure, because trying to put concurrency within the business model is far too hard and easy to get wrong. You would also fulfill the OO programmer's dream: code that is easy to write, testable and readable. As a consequence, it should shorten your time to market
  • Follow the Disruptor pattern, a system pattern that tries to avoid contention wherever possible (even with business logic running on a single thread)
---------------------------------------------------------------------------------------------------------------
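To make the "ring buffer instead of queues" tip concrete, here is a minimal single-producer/single-consumer ring buffer sketch. It is only loosely inspired by the ideas above, NOT the actual Disruptor code: preallocated power-of-two storage, no locks, and two monotonic sequence counters instead of head/tail node juggling (each counter has a single writer, which is what makes the lock-free 1P/1C case tractable).

```java
// Minimal single-producer/single-consumer ring buffer sketch.
// Preallocated storage; lock-free for the 1P/1C case because each
// volatile counter is written by exactly one thread.
class RingBuffer {
    private final long[] slots;     // preallocated, power-of-two sized
    private final int mask;
    private volatile long head = 0; // next slot to read (consumer-owned)
    private volatile long tail = 0; // next slot to write (producer-owned)

    RingBuffer(int capacityPowerOfTwo) {
        slots = new long[capacityPowerOfTwo];
        mask = capacityPowerOfTwo - 1;
    }

    boolean offer(long value) {
        if (tail - head == slots.length) return false; // full
        slots[(int) (tail & mask)] = value;
        tail++;                                        // publish to consumer
        return true;
    }

    Long poll() {
        if (head == tail) return null;                 // empty
        long v = slots[(int) (head & mask)];
        head++;                                        // free slot for producer
        return v;
    }
}
```

Note the design choice echoed from the talk: the buffer never allocates after construction, and the power-of-two mask replaces a modulo, keeping the hot path down to an array access and a counter increment.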

Ok, this presentation leaves us with lots of questions about how they implemented their systems. However, I found this video very refreshing and interesting on several points (the need for huge amounts of RAM with 64-bit addressing, the interest of kernel-bypassing network technologies with RDMA, the fact that queues are bad at handling concurrency properly, that clean and simple code doesn't prevent you from having excellent performance, that concurrency should stay out of the business logic, that the separation of concerns allows you to have a very competitive time-to-market, ... and of course that you need a write-performance-tests-first approach).


For the Disruptor pattern explanations and much more, watch the entire video of this presentation here: http://www.infoq.com/presentations/LMAX

Note: don't watch this video in full-screen mode, otherwise you will lose the benefit of the slides displayed below ;-(

Cheers, and happy new year to everyone ;-)

Saturday, 6 November 2010

The 9 indispensable DEBUGGING RULES

It's been a while since I've posted on this blog (glad to be back ;-)

As I was recently in touch with some young developers who seemed lost when encountering issues with libraries or technologies they had just discovered, I thought it could be interesting to refresh/share a synthesis of the 9 indispensable debugging rules from the must-read: Debugging: The Nine Indispensable Rules for Finding Even the Most Elusive Software and Hardware Problems (David J. Agans, ©2002)

Ok then, here are the priceless rules:


[quote starts here]

Memorize them. Tape them to your wall. Tape them to all of your walls.

THE 9 DEBUGGING RULES

  1. UNDERSTAND THE SYSTEM
  2. MAKE IT FAIL
  3. QUIT THINKING AND LOOK
  4. DIVIDE AND CONQUER
  5. CHANGE ONE THING AT A TIME
  6. KEEP AN AUDIT TRAIL
  7. CHECK THE PLUG
  8. GET A FRESH VIEW
  9. IF YOU DIDN'T FIX IT, IT AIN'T FIXED

1. UNDERSTAND THE SYSTEM

This is the first rule because it's the most important. Understand?

  • Read the manual. It'll tell you to lubricate the trimmer head on your weed whacker so that the lines don't fuse together.
  • Read everything in depth. The section about the interrupt getting to your microcomputer is buried on page 37.
  • Know the fundamentals. Chain saws are supposed to be loud.
  • Know the road map. Engine speed can be different from tire speed, and the difference is in the transmission.
  • Understand your tools. Know which end of the thermometer is which, and how to use the fancy features on your Glitch-O-Matic logic analyzer.
  • Look up the details. Even Einstein looked up the details. Kneejerk, on the other hand, trusted his memory.

2. MAKE IT FAIL

It seems easy, but if you don't do it, debugging is hard.

  • Do it again. Do it again so you can look at it, so you can focus on the cause, and so you can tell if you fixed it.
  • Start at the beginning. The mechanic needs to know that the car went through the car wash before the windows froze.
  • Stimulate the failure. Spray a hose on that leaky window.
  • But don't simulate the failure. Spray a hose on the leaky window, not on a different, "similar" one.
  • Find the uncontrolled condition that makes it intermittent. Vary everything you can—shake it, rattle it, roll it, and twist it until it shouts.
  • Record everything and find the signature of intermittent bugs. Our bonding system always and only failed on jumbled calls.
  • Don't trust statistics too much. The bonding problem seemed to be related to the time of day, but it was actually the local teenagers tying up the phone lines.
  • Know that "that" can happen. Even the ice cream flavor can matter.
  • Never throw away a debugging tool. A robot paddle might come in handy someday.

3. QUIT THINKING AND LOOK

You can think up thousands of possible reasons for a failure. You can see only the actual cause.

  • See the failure. The senior engineer saw the real failure and was able to find the cause. The junior guys thought they knew what the failure was and fixed something that wasn't broken.
  • See the details. Don't stop when you hear the pump. Go down to the basement and find out which pump.
  • Build instrumentation in. Use source code debuggers, debug logs, status messages, flashing lights, and rotten egg odors.
  • Add instrumentation on. Use analyzers, scopes, meters, metal detectors, electrocardiography machines, and soap bubbles.
  • Don't be afraid to dive in. So it's production software. It's broken, and you'll have to open it up to fix it.
  • Watch out for Heisenberg. Don't let your instruments overwhelm your system.
  • Guess only to focus the search. Go ahead and guess that the memory timing is bad, but look at it before you build a timing fixer.

4. DIVIDE AND CONQUER

It's hard for a bug to keep hiding when its hiding place keeps getting cut in half.

  • Narrow the search with successive approximation. Guess a number from 1 to 100, in seven guesses.
  • Get the range. If the number is 135 and you think the range is 1 to 100, you'll have to widen the range.
  • Determine which side of the bug you are on. If there's goo, the pipe is upstream. If there's no goo, the pipe is downstream.
  • Use easy-to-spot test patterns. Start with clean, clear water so the goo is obvious when it enters the stream.
  • Start with the bad. There are too many good parts to verify. Start where it's broken and work your way back up to the cause.
  • Fix the bugs you know about. Bugs defend and hide one another. Take 'em out as soon as you find 'em.
  • Fix the noise first. Watch for stuff that you know will make the rest of the system go crazy. But don't get carried away on marginal problems or aesthetic changes.

5. CHANGE ONE THING AT A TIME

You need some predictability in your life. Remove the changes that didn't do what you expected. They probably did something you didn't expect.

  • Isolate the key factor. Don't change the watering schedule if you're looking for the effect of the sunlight.
  • Grab the brass bar with both hands. If you try to fix the nuke without knowing what's wrong first, you may have an underwater Chernobyl on your hands.
  • Change one test at a time. I knew my VGA capture phase was broken because nothing else was changing.
  • Compare it with a good one. If the bad ones all have something that the good ones don't, you're onto the problem.
  • Determine what you changed since the last time it worked. My friend had changed the cartridge on the turntable, so that was a good place to start.

6. KEEP AN AUDIT TRAIL

Better yet, don't remember "Keep an Audit Trail." Write down "Keep an Audit Trail."

  • Write down what you did, in what order, and what happened as a result. When did you last drink coffee? When did the headache start?
  • Understand that any detail could be the important one. It had to be a plaid shirt to crash the video chip.
  • Correlate events. "It made a noise for four seconds starting at 21:04:53" is better than "It made a noise."
  • Understand that audit trails for design are also good for testing. Software configuration control tools can tell you which revision introduced the bug.
  • Write it down! No matter how horrible the moment, make a memorandum of it.

7. CHECK THE PLUG

Obvious assumptions are often wrong. And to rub it in, assumption bugs are usually the easiest to fix.

  • Question your assumptions. Are you running the right code? Are you out of gas? Is it plugged in?
  • Start at the beginning. Did you initialize memory properly? Did you squeeze the primer bulb? Did you turn it on?
  • Test the tool. Are you running the right compiler? Is the fuel gauge stuck? Does the meter have a dead battery?

8. GET A FRESH VIEW

You need to take a break and get some coffee, anyway.

  • Ask for fresh insights. Even a dummy can help you see something you didn't see before.
  • Tap expertise. Only the VGA capture vendor could confirm that the phase function was broken.
  • Listen to the voice of experience. It will tell you the dome light wire gets pinched all the time.
  • Know that help is all around you. Coworkers, vendors, the Web, and the bookstore are waiting for you to ask.
  • Don't be proud. Bugs happen. Take pride in getting rid of them, not in getting rid of them by yourself.
  • Report symptoms, not theories. Don't drag a crowd into your rut.
  • Realize that you don't have to be sure. Mention that the shirt was plaid.

9. IF YOU DIDN'T FIX IT, IT AIN'T FIXED

And now that you have all these techniques, there's no excuse for leaving it unfixed.

  • Check that it's really fixed. Don't assume that it was the wires and send that dirty fuel filter back onto the road.
  • Check that it's really your fix that fixed it. "Wubba!" might not be the thing that did the trick.
  • Know that it never just goes away by itself. Make it come back by using the original Make It Fail methods. If you have to ship it, ship it with a trap to catch it when it happens in the field.
  • Fix the cause. Tear out the useless eight-track deck before you burn out another transformer.
  • Fix the process. Don't settle for just cleaning up the oil. Fix the way you design machines.

(David J. Agans - ©2002)

[quote ends here]

Wednesday, 3 September 2008

The cleaning wiimote

A Japanese guy found a way to control a vacuum cleaner with a Wiimote. It's fun to see what can be done with this awesome device. Click here to see the video.

This reminds me of an old post dedicated to a Managed Library for Nintendo's Wiimote.

Tuesday, 2 September 2008

The vicious deadlock situation (the one that does not freeze the GUI but leaks memory)

This post is dedicated to Windows Forms UI deadlock situations.

I) The classical deadlock situation (the one that freezes the UI)

This happens when synchronization with the UI thread is done in a synchronous manner (Control.Invoke(...), SynchronizationContext.Send(...), etc.).
Solution: to fix this kind of deadlock, you may use asynchronous APIs to delegate task execution to the UI thread.

These asynchronous APIs may be:

  • System.Threading.SynchronizationContext.Post(...)
  • Control.BeginInvoke(...) - but this may cost too much CPU due to the underlying .NET reflection usage (I'll write a post on that topic later)
  • ...

II) The vicious deadlock situation (the one that does not freeze the UI but leaks memory)

This kind of deadlock is slightly more difficult to diagnose. It happens when a non-UI thread calls the Invoke method of a Control that belongs to a closed Windows Form (which may happen just after the Form is closed).

In that case, execution blocks indefinitely on the call to Control.Invoke() and the delegate supplied to it never starts (see here) => this prevents the lock from being released by the blocked thread!


Solution: prevent situations where "zombie controls" can be accessed (be careful to unsubscribe from all .NET events, to avoid keeping objects alive) and use one of the asynchronous APIs to delegate task execution to the UI thread.
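The "unsubscribe your events" part of that advice applies on any platform, not just Windows Forms: a publisher holds strong references to its subscribers, so a view that forgets to deregister stays alive (and reachable by worker threads) after it is closed. Here is a minimal, platform-neutral sketch of the discipline, written in Java with hypothetical names (`PricePublisher`, `PriceView` are illustrations, not a real API):

```java
import java.util.ArrayList;
import java.util.List;

// A publisher keeps strong references to its subscribers: forgetting to
// unsubscribe keeps closed views alive, i.e. reachable "zombie controls".
class PricePublisher {
    interface Listener { void onPrice(double price); }
    private final List<Listener> listeners = new ArrayList<>();
    void subscribe(Listener l)   { listeners.add(l); }
    void unsubscribe(Listener l) { listeners.remove(l); }
    int subscriberCount()        { return listeners.size(); }
}

class PriceView {
    private final PricePublisher publisher;
    private final PricePublisher.Listener listener = p -> { /* update UI */ };

    PriceView(PricePublisher publisher) {
        this.publisher = publisher;
        publisher.subscribe(listener);
    }

    // Called when the window closes: without this, the publisher still
    // references the dead view and may keep dispatching to it.
    void close() {
        publisher.unsubscribe(listener);
    }
}
```

Once `close()` has run, the publisher holds no reference to the view: no worker thread can dispatch to it anymore, and the GC can reclaim it.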