Debugging Large Scale Service Oriented Systems

How do you track down a problem in production? A homogeneous system that you can query, and that holds all the information for a request as it passes through your production environment, is really powerful.

Approach used:

  • Elasticsearch
  • Rack middleware that reports information about each request
  • Integration with Resque / background jobs
  • Thread-local variables (Thread.current[:key]) to pass information around
  • Inspired by Google Dapper
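
To show how these pieces could fit together, here is a minimal sketch in Ruby. It is not the code from the talk. The class name TraceMiddleware, the X-Trace-Id header, the event fields and the traced_job_payload helper are all assumptions made for illustration. The reporter is a plain callable so that the same middleware could hand events to an Elasticsearch indexer; here it defaults to printing JSON.

```ruby
require "securerandom"
require "json"

# Hypothetical Rack middleware: tags each request with a trace ID,
# exposes it to the rest of the request via Thread.current, and
# reports one event per request.
class TraceMiddleware
  HEADER = "HTTP_X_TRACE_ID" # assumed header name, not from the source

  def initialize(app, reporter: ->(event) { $stdout.puts(JSON.generate(event)) })
    @app = app
    @reporter = reporter # any callable; could index into Elasticsearch
  end

  def call(env)
    # Reuse an upstream ID so one request keeps one ID across services.
    trace_id = env[HEADER] || SecureRandom.hex(8)
    # Note: Thread.current[] storage is fiber-local in Ruby.
    Thread.current[:trace_id] = trace_id
    started = Process.clock_gettime(Process::CLOCK_MONOTONIC)
    status, headers, body = @app.call(env)
    [status, headers, body]
  ensure
    elapsed = Process.clock_gettime(Process::CLOCK_MONOTONIC) - started
    @reporter.call({
      trace_id: trace_id,
      path: env["PATH_INFO"],
      status: status, # nil if the app raised
      duration_ms: (elapsed * 1000).round(2)
    })
    Thread.current[:trace_id] = nil # do not leak into the next request
  end
end

# Hypothetical helper: carry the current trace ID into a background job's
# arguments so the worker can restore it before doing its work.
def traced_job_payload(args)
  { "trace_id" => Thread.current[:trace_id], "args" => args }
end
```

Because the trace ID travels in the job payload, a worker can set Thread.current[:trace_id] again before it runs, and a single query on that ID then returns both the web request and the background work it triggered.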
