Showing posts with label monitoring. Show all posts

Thursday, October 31, 2013

a monitoring story with mongo MMS at Pixable.

A few hours ago I was talking to my boss about the benefits of using monitoring tools built by the same developers as the tools they monitor. The best example we found was MongoDB and MMS (MongoDB Management Service). We decided to write a short tale about it, and we want to share it with you:


"Every night I lived in fear. Sleeping was difficult, not because of nightmares but because of alerts on my cellphone saying that our API queues were growing, response times were spiking, and everything was slowly falling apart.

The problem did not occur at the same time each day, which made it more difficult to debug.
Finally our super duper architect installed MMS. On the second day of using MMS I saw the light at the end of the tunnel. There was a clear spike in the page faults indicator at exactly 3:00am. Mongo was doing everything possible to keep working, but it eventually failed minutes or hours later, which is why we never got the alert at the same time.

So, easy, no? It's a cron job. We went to our code, looked for all the cron jobs running at that time, and found one that looped through every single user in the system. But that query had a read preference of secondary only, so by itself it should not have been hurting the Primary. So what was it?

After digging into what the cron was doing, we found that, for each user, it ran a very, very simple query that hit random places of another collection, and this was making mongo page fault A LOT.

Eventually the paging evicted all the hot data from RAM on the Primary, and the cluster became a slow wagon.

We switched that per-user query to run on secondaries, and now we are a happy family again. I sleep like a baby now."
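
For readers who have not touched read preferences before, here is a minimal sketch of what "run on secondaries" can look like at the connection level. It only builds a MongoDB connection string using the standard `replicaSet` and `readPreference` URI options. The host names, database name and replica set name are invented for the example, and this is not the actual Pixable code.

```python
from urllib.parse import urlencode

# Read preference modes accepted in a MongoDB connection string.
READ_PREFERENCES = {
    "primary", "primaryPreferred", "secondary", "secondaryPreferred", "nearest",
}

def build_mongo_uri(hosts, database, replica_set, read_preference="secondary"):
    """Build a connection string that asks the driver to read from secondaries.

    hosts, database and replica_set are placeholders supplied by the caller.
    """
    if read_preference not in READ_PREFERENCES:
        raise ValueError("unknown read preference: {}".format(read_preference))
    options = urlencode({
        "replicaSet": replica_set,
        "readPreference": read_preference,
    })
    return "mongodb://{}/{}?{}".format(",".join(hosts), database, options)

# Hypothetical replica set members, for illustration only.
uri = build_mongo_uri(
    ["db1.example.com:27017", "db2.example.com:27017"], "app", "rs0"
)
print(uri)
```

With `secondary`, reads fail when no secondary is available; `secondaryPreferred` falls back to the Primary in that case, which may or may not be what you want for a heavy batch job.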


Our final message is that you should work with the tools your providers give you in order to detect problems and opportunities to improve. Do not get me wrong: like any tool, it could be improved (knowing the MongoDB people, I am sure it will be). We love MongoDB and we believe in MMS's capabilities.
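
The graph in the story came from MMS, but the idea of spotting a spike in a metric is simple enough to sketch. The snippet below is only an illustration and not how MMS works internally: it flags samples that exceed a multiple of the median. The sample values, the `find_spikes` name and the `factor` parameter are all invented for the example.

```python
from statistics import median

def find_spikes(samples, factor=5.0):
    """Return the (label, value) samples whose value exceeds factor x the median.

    samples is a list of (label, value) pairs, e.g. page faults per hour.
    If the median is 0, any positive value counts as a spike.
    """
    values = [value for _, value in samples]
    if not values:
        return []
    threshold = median(values) * factor
    return [(label, value) for label, value in samples if value > threshold]

# Made-up page fault counts, one sample per hour.
samples = [
    ("00:00", 12), ("01:00", 9), ("02:00", 11),
    ("03:00", 480), ("04:00", 40), ("05:00", 10),
]
print(find_spikes(samples))  # [('03:00', 480)]
```

The median is used instead of the mean so that the spike itself does not drag the baseline up.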

Check it out at http://mms.mongodb.com

Thanks to Julio Viera for working with me on this post.

Wednesday, July 3, 2013

my app is not growing, should i become Steve Jobs?

Today, while I was having lunch with a really good friend, one of our topics of conversation was the process of getting more users and keeping them using the app. Let's face it: the valuation of our software depends on how many users we have and how well we retain them. Here I would like to cite “The Lean Startup”, where Eric Ries talks about the importance of measuring as part of your product development cycle. Seeing how your users react to and interact with new features and ideas should influence your decisions about those features and ideas.

That is key. We need to measure things like:
  • Daily Active Users (DAU)
  • Monthly Active Users (MAU)
  • New Users
  • Retention
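
As a rough sketch, the four numbers above can be computed from a plain log of (user, date) events. Definitions vary between teams (retention in particular), so treat the ones below as assumptions: a user is "new" on the day of their first recorded event, and retention is the fraction of one day's new users who are active again on a later day. The function names and sample users are invented for the example.

```python
from datetime import date

def daily_active_users(events, day):
    """Count distinct users with at least one event on `day`."""
    return len({user for user, d in events if d == day})

def monthly_active_users(events, year, month):
    """Count distinct users with at least one event in the given month."""
    return len({user for user, d in events if (d.year, d.month) == (year, month)})

def new_users(events, day):
    """Return the set of users whose first recorded event falls on `day`."""
    first_seen = {}
    for user, d in events:
        if user not in first_seen or d < first_seen[user]:
            first_seen[user] = d
    return {user for user, d in first_seen.items() if d == day}

def retention(events, cohort_day, return_day):
    """Fraction of users new on `cohort_day` who were active on `return_day`."""
    cohort = new_users(events, cohort_day)
    if not cohort:
        return 0.0
    returned = {user for user, d in events if d == return_day and user in cohort}
    return len(returned) / len(cohort)

# Made-up event log: (user_id, day of activity).
events = [
    ("ana", date(2013, 7, 1)), ("bob", date(2013, 7, 1)),
    ("ana", date(2013, 7, 2)), ("cat", date(2013, 7, 2)),
    ("bob", date(2013, 7, 15)),
]
print(daily_active_users(events, date(2013, 7, 2)))          # 2
print(monthly_active_users(events, 2013, 7))                 # 3
print(retention(events, date(2013, 7, 1), date(2013, 7, 2))) # 0.5
```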
These metrics show whether you are heading in the right direction, but knowing your retention does not tell you what to do. Suppose users are not coming back to your website: you will see a drop in DAU, but do you know what to do about it? To fix the DAU drop you need to find out how to get your users back. So this is the answer to your problem:

you need the user to like your app

It is kind of obvious that increasing DAU means keeping users coming back as often as possible. For that, I would like to share what I think are the keys:

  • having an idea does not make you an expert
  • never consider yourself a regular user, come on, you got the idea.
  • if you want to develop something new because you have a gut feeling, it could be food poisoning... you need features that are oriented toward keeping users.
  • investing a lot of time and resources in developing what YOU believe is right can be really expensive; it is better to invest those resources wisely in whatever is aligned with the users.
  • you do need A/B testing
  • because you need users to come back, you need to develop what users want/need/expect. You are not alone in this task: there are tools for collecting user feedback (like opinionlab, ideascale, userecho, feedbackify, survey monkey, google forms or any custom tool you develop yourself)
  • there are techniques like QFD that can translate user requirements into functional requirements. PLEASE USE THOSE KINDS OF TOOLS/TECHNIQUES
  • usability is really important, and so is the UX (user experience); do not forget about that!
  • use your own data. Direct marketing is a good option: take your data, profile your power users, and try to reach more people like them. Again... BE SMART.


I would like to finish by answering the question:

NO, YOU SHOULD NOT BECOME STEVE JOBS.

He was a visionary, and one of his abilities was understanding people. You are special because of the abilities you have developed over time, the effort that makes you stand out from the crowd, and everything you know and can do.



Tuesday, July 2, 2013

keeping an eye on the platform

Sometimes we feel like our platform is not doing well; maybe we see slowness in the service. Well, the answer lies in questions like:
  • are you ready to grow??
  • what is the performance of your platform??
  • do you know where the bottlenecks are??
Well, in order to find those answers, you need an eye on your platform (maybe more than one). Today this can be fun and simple, and above all very powerful, thanks to the amazing number of tools available.

If you are using a cloud provider you will get some basic metrics, but the real power lies elsewhere. I will give you some options and, as usual, it is up to you to choose whichever you like and feel comfortable with.
  • nagios, basically a nice alerting system; it runs checks against your servers (through plugins, SNMP among them) and shows you the status per host/service (ok, warning, critical).
  • cacti, it is just graphing; it polls metrics over snmp and graphs them.
  • munin, helpful for collecting data because it uses its own client/server architecture (munin-node on each host reporting to a munin master); it also draws graphs, but think of it as monitoring rather than a pure graphing tool.
  • collectd, I am starting to love this client/server option (like munin).
  • graphite, a nice graphing server with many frontends available.
  • ganglia, a data collection system (client/server) with a really ugly frontend, but pretty fast.
This is an important point: YOU CAN COMBINE SOME OF THEM!!! Yes, you could use collectd+graphite+nagios. Anyway, as usual, you should take care of scalability on two fronts: 1.- monitoring must not overload your platform, and 2.- your monitoring platform must be ready to scale with you.
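
To give a feel for how such a combination fits together, here is a minimal sketch of the two interfaces involved. Graphite's carbon listener accepts plaintext lines of the form "path value timestamp" (port 2003 by default), and Nagios plugins report status through exit codes (0 ok, 1 warning, 2 critical). In a real setup collectd would do the collecting and shipping; the metric path, thresholds and function names below are invented for the example.

```python
import socket
import time

# Exit codes that Nagios plugins use by convention.
OK, WARNING, CRITICAL = 0, 1, 2

def graphite_line(path, value, timestamp=None):
    """Format one metric as a Graphite plaintext line: path, value, timestamp."""
    if timestamp is None:
        timestamp = int(time.time())
    return "{} {} {}\n".format(path, value, int(timestamp))

def send_to_graphite(line, host, port=2003, timeout=5.0):
    """Send an already formatted line to a carbon plaintext listener."""
    with socket.create_connection((host, port), timeout=timeout) as conn:
        conn.sendall(line.encode("ascii"))

def nagios_status(value, warning, critical):
    """Map a metric value to a Nagios-style status using two thresholds."""
    if value >= critical:
        return CRITICAL
    if value >= warning:
        return WARNING
    return OK

# Hypothetical metric: 1-minute load average of a host called web01.
line = graphite_line("servers.web01.load.shortterm", 0.75, 1372723200)
print(line, end="")                     # servers.web01.load.shortterm 0.75 1372723200
print(nagios_status(0.75, 4.0, 8.0))    # 0
```

Nothing is sent over the network unless you call `send_to_graphite` with the address of your own carbon host, so the formatting and threshold logic can be tried without any servers.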

Please, monitor, and enjoy understanding where your bottlenecks and your performance opportunities are. Remember that you need to collect and process data in order to have something to analyze and make decisions with.