Friday, July 19, 2013

providers, providers, providers.. do not kill me, help me!

When you have a business, it is important to be careful with every aspect, but there are two that are really important to me:

  1. The people you are hiring
  2. Service providers
I use that order because I believe it is the correct one. In the end, an organization (company, enterprise or whatever you want to call it) is a structure with people giving meaning to that structure, and the most important element IS THE PEOPLE.


The second point is about service providers, because many of the businesses we are building today rely on services that we hire. History teaches us that providers can fail. For example, if you have a blog on wordpress.com, they could suffer an outage and take your blog down. That is bad, but imagine having your core business app hosted in a place that goes down (at that point you could start crying). If you do not want to cry, what you need are really reliable providers.

Talking about providers, we expect help from them. A cleaning company that leaves more trash than it collects is not a good deal. Let's face it, we need more help than problems, and that, my friends, is a good indicator of the quality of a provider (the ratio of problems to help has to be almost zero).

Being very honest, problems happen, and we will have to deal with them, but what we do not need (want) is the same problems over and over.

I will give you two examples:

  • AWS, they are the first provider of cloud computing services, and they have (in my personal opinion) one of the most amazing stacks of cloud oriented services. Obviously, they will charge you for breathing if they have a chance. They have premium support that can be really expensive, for me it could be something like USD 2000 monthly (OUCH). If you are not a member of the premium group then you are screwed!! You can open a ticket and pray for a solution in less than 2-3 days, or post something in the AWS forum and again pray for a solution. For example, the AWS Management Console showed me a pending "EVENT": an instance was running on "DEGRADED HARDWARE". One of the AWS solutions is to Stop/Start the instance. Well, I did that... and the instance was stuck in the stopping state for 5 hours. Thankfully it was not a really important server, otherwise that would have meant a downtime of 5 hours. Obviously, I reported the issue after 5 minutes, but the solution arrived a "little bit" later.
  • Time Warner Cable (TWC), do I need to say more?? Anyway, I will.. Imagine you have an Internet based product, so you need a reliable connection. Well, TWC offers connections of 30Mbps/3Mbps, 30Mbps/5Mbps, 50Mbps/3Mbps and 50Mbps/5Mbps (something like that). Usually, if you get 50% of that, you are really lucky, and the traceroutes can vary from 20ms per hop to 1200ms per hop within a five minute period. By the way, if you call support, they will give you reasons like "it is related to solar flares" (really, they told me that once, and my connection is coaxial) or "it could be a problem with the AMPLIFIERS" (OMG, do I have to plug headphones into the router?? where are the speakers?). Whatever they say, the final sentence is the same.. "It should be fixed in a while, wait for a couple of hours". In real life, that means something like "stop bothering us because we are busy playing solitaire".
Do those examples sound like reliable providers??

I can give you some personal keys:
  • Try to build your apps on disposable servers, so you can get over any kind of situation like the "DEGRADED HARDWARE" case. Volatile is key!
  • If you cannot make it volatile, make it easy to recover, maybe with Amazon AMIs or any other kind of machine image, backups with fast recovery and fast deploys. Chef is awesome, but having a recipe that takes 1 hour to run is a really BAD option, so speed it up.
  • Get redundancy, do not put all your eggs in one basket. For example, if you are in AWS, use multiple regions and multiple availability zones in each region.
  • If you can combine multiple providers, DO IT!!
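
The redundancy idea in the last two points can be sketched in a few lines of Python. This is only a minimal sketch under my own assumptions: the function name `first_healthy`, the endpoint names and the fake health check are all hypothetical, and a real setup would plug in a real check (an HTTP ping, a TCP connect, etc).

```python
from typing import Callable, Iterable


def first_healthy(endpoints: Iterable[str], is_healthy: Callable[[str], bool]) -> str:
    """Return the first endpoint that passes the health check.

    `endpoints` is ordered by preference, for example: primary region first,
    then another region or availability zone, then a second provider.
    """
    for endpoint in endpoints:
        try:
            if is_healthy(endpoint):
                return endpoint
        except Exception:
            # A health check that blows up counts as "unhealthy": try the next one.
            continue
    raise RuntimeError("no healthy endpoint available")


if __name__ == "__main__":
    # Hypothetical endpoint names and a fake health check, for illustration only.
    candidates = ["aws-us-east-1", "aws-us-west-2", "other-provider"]
    down = {"aws-us-east-1"}
    print(first_healthy(candidates, lambda name: name not in down))  # aws-us-west-2
```

The point is not the code itself, it is that the list has more than one entry and that they do not all live in the same basket.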



Thursday, July 11, 2013

get a backup before it's too late

Over the last years we have learned to take care of our code by doing backups in zip files, then we evolved and started using SVN to control changes and have a backup; today, we are mostly using Git for version control, keeping the backups in a system that provides much more functionality around the main concept of a BACKUP.

That is really cool, if you are doing version control then you are doing something good. But suppose, for example, that you are running a website. You will have a backup of your code in a repository, and if you are using a Content Delivery Network (CDN) then you will have a backup of the files that you are sharing, but what happens with your database?

Well, obviously you will need a backup of your database, but usually as soon as you start growing your databases get bigger and bigger, and really quickly you will be unable to do a simple dump of your DB (damn you scalability). Ok, now I would like to list some considerations prior to deploying a backup system:

  1. We should have zero downtime!! Basically, the operation of the system should not be affected by any backup.
  2. Optimal use of storage resources. If the database grows 100% weekly, then you will face an issue keeping tons of megabytes of storage for backups. Having incremental backups sounds good.
  3. Backups should be done often. It is useless to have backups every month if the business changes every day or every hour.
  4. Good use of processing time. If you need to do 4 backups a day, and each backup takes 12 hours, then something is really wrong.
  5. Having the optimal system is irrelevant if you are not able to restore from a backup.
  6. If you have incremental backups, then you could have the option to restore to a point in time (maybe).
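
Points 2 and 4 are just arithmetic, so here is a minimal Python sketch of it. The function names are hypothetical, the schedule check assumes the backups run one after another, and the incremental estimate assumes the database only grows (existing data is never rewritten), which is optimistic.

```python
def backups_fit_in_a_day(backups_per_day: int, hours_per_backup: float) -> bool:
    """True if the backups, run one after another, finish within 24 hours."""
    return backups_per_day * hours_per_backup <= 24


def full_backup_storage(initial_size_gb: float, weekly_growth: float, weeks: int) -> float:
    """GB needed to keep one full backup per week for `weeks` weeks.

    `weekly_growth` is a fraction: 1.0 means the database grows 100% weekly.
    """
    total = 0.0
    size = initial_size_gb
    for _ in range(weeks):
        total += size
        size *= 1 + weekly_growth
    return total


def incremental_backup_storage(initial_size_gb: float, weekly_growth: float, weeks: int) -> float:
    """GB needed for one full backup plus weekly increments with only the new data."""
    total = initial_size_gb
    size = initial_size_gb
    for _ in range(weeks - 1):
        new_size = size * (1 + weekly_growth)
        total += new_size - size  # only the data added during that week
        size = new_size
    return total


if __name__ == "__main__":
    print(backups_fit_in_a_day(4, 12))                # False: 48 hours do not fit in a day
    print(full_backup_storage(10, 1.0, 4))            # 150.0 (10 + 20 + 40 + 80)
    print(incremental_backup_storage(10, 1.0, 4))     # 80.0
```

With a database that doubles every week, four weekly full backups already cost almost twice what the incremental approach costs, and the gap keeps growing.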
Taking those points into account, I would like to give you two options based on the systems that I am managing at this time.

First, I love using MongoDB. The creators of MongoDB, 10gen, have an amazing set of tools available for everyone called MMS, and they started offering a really good backup service. I have tested it and I have had really good results, so I recommend it 100%.

Second, I am also a MySQL user. I have systems running in master/slave mode and I even have circular replication between two masters (this sounds really nice), so it is complex and it is BIG. Some time ago I was testing a tool from Percona called xtrabackup under a disaster recovery scenario. After that, I stopped using it; not because of the tool, it was just laziness, but it is time to start testing again. This tool is amazing, it is like the backup system of heaven.

The price of the backups is not related to the backup itself, it is related to the price that you will pay if you do not have a backup.

I have given you several places to look at and read about. If you have any other option, please let me know. And enjoy the inner peace of having backups!



Wednesday, July 10, 2013

orchestration is one step for growing

When you are taking care of a platform, sometimes you need to run tasks on a set of servers, and everything will go like a charm if you have a few servers. But imagine that your platform grows from 10 servers to 200 servers. Well, that could make your job a little bit awful.

Even if you have an image of the server, every time you need a change you can create a new image, but you will need to replace all the servers that are running or coordinate an update. Obviously you cannot do that manually, so you need some kind of orchestration. Orchestration refers to the management and coordination of tasks in big systems.

At this point you will find many people suggesting tools. I would like to give an opinion that, by the way, is not final because I am still testing. As usual I will give you choices, so you can walk whatever YOU think is the right path.

Let's start with Puppet. This tool is awesome, you can do almost everything with it, monitoring and orchestrating, but let's be honest, it sounds like too much, and it is.. It will probably be awesome, but it will be like cooking a steak in a nuclear reactor. Anyway, you can give it a chance.

Second tool, Chef. It is really amazing how it can automate configuration and installation and keep track of configuration.. Actually, I am trying to fall in love with it but our relationship is complicated: you need to be very pragmatic, practical and organized. When you use a tool like that, you need to know when to start doing things your own way. It is better to keep the recipes the original way, but any real life chef knows that they can twist a recipe to make it taste the way they want.

My last option, do it yourself!! For example, I have built a set of scripts in python+fabric that allow me to do many things against a set of servers in AWS. I can define common tasks, like cleaning logs, and run them against all the webservers, or I can open a virtual console that lets me send custom commands to a set of servers, even in different regions. Actually, I love fabric, I think that framework is AWESOME.
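
To give you an idea of the "do it yourself" path, here is a minimal sketch of running one command against a set of servers in parallel. It is NOT my fabric scripts and it does not use the fabric API: it only uses the Python standard library and the local `ssh` client, and the names `ssh_runner` and `run_on_hosts` plus the host names are hypothetical. It also assumes key based SSH access is already configured.

```python
import subprocess
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Dict, Iterable


def ssh_runner(host: str, command: str) -> str:
    """Run `command` on `host` through the local ssh client and return its stdout."""
    result = subprocess.run(
        ["ssh", host, command],
        capture_output=True, text=True, check=True, timeout=60,
    )
    return result.stdout


def run_on_hosts(
    hosts: Iterable[str],
    command: str,
    runner: Callable[[str, str], str] = ssh_runner,
    max_workers: int = 10,
) -> Dict[str, str]:
    """Run one command on every host in parallel; map host -> output or error text."""
    host_list = list(hosts)

    def safe_run(host: str) -> str:
        try:
            return runner(host, command)
        except Exception as exc:
            # One broken host should not stop the rest of the run.
            return f"ERROR: {exc}"

    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        outputs = list(pool.map(safe_run, host_list))
    return dict(zip(host_list, outputs))


if __name__ == "__main__":
    # Fake runner so the example works without real servers (hypothetical hosts).
    fake = lambda host, cmd: f"{host} ran: {cmd}"
    for host, output in run_on_hosts(["web1", "web2"], "uptime", runner=fake).items():
        print(host, "->", output)
```

Passing the runner as a parameter keeps the coordination logic separate from the transport, so you can swap the plain `ssh` call for fabric or anything else you trust.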

You need to test every option by yourself and take into account the size and characteristics of your business. There is no treasure map, there are clues and you have to build your own map.

Enjoy the ride!


Tuesday, July 9, 2013

Simple, Standard and Open

Today, I was at Google NYC attending the Meetup about the Crisis Response project. I learned a couple of things there that I would like to share with you, and I also remembered some old concepts. Let's start by recognizing that some people at Google are using their knowledge and some cool tools to help people. If you want to know a little bit more about that, check the google.org site. GOOD JOB PEOPLE!!

Then, at the beginning of the tech talk, the presenter showed a slide with these 3 words:

SIMPLE

STANDARD

OPEN

Reading that, I came up with the idea of sharing my opinion about those words.

Talking about SIMPLE, we usually start building stuff, coding, developing, and we walk around many options or solutions. Well, I will tell you something: you should try the simplest one first. You do not need to create a time machine to get the current system time... please KEEP IT SIMPLE!!

In second place, let's be standard. If you have a proven solution, please use it. After having your product up and running you will have the chance to invent stuff to replace the pieces that could work better, but the secret is here.. USE THE STANDARDS. If the specification of a product says that you have to do X, then why are you trying to do 1/X? If the language is object oriented, why are you trying to write procedural code? Be STANDARD.

Openness does not refer only to open source products, it refers to being open to collaboration and open to collaborate. If you have a problem, ASK; if you found a solution, PUBLISH it. Also remember that most of the time you are working with other people, so please be gentle and comment the code, create READMEs, etc.

Monday, July 8, 2013

the nightmare of shared passwords

Today, my title is clear. Every time someone joins the operations team (call us devops, sysadmins, etc) we need to start sharing passwords for different services, and at that point we start hearing the glove armed with razors scratching a pipe (just like A Nightmare on Elm Street). And now I will scare you to death: imagine that after months of giving out passwords for services, one of the members of the team leaves the company. OMG!! kill me now Mr. Krueger!!

Well, there are some password managers that could give us clues and help us, but keep in mind that I am not 100% comfortable with any of them. I would like something more custom, adapted to my concepts of safety, simplicity and usability.

We have three choices:

  1. Hosted services, and by hosted I am talking about a service that someone else provides (here my spidey sense starts to alert me about someone else having my passwords).
  2. Standalone apps. Well, nobody else has my passwords, and I could share the password DB, but keeping it synchronized really sucks.
  3. Mixed environment. Now we are talking, this sounds more like a good choice.
Some of the tools available are hosted, like CommonKey, some are standalone, like KeepassX, and some, like 1password, can be set up to store the DB in Dropbox.

This is my wish list of functionalities:
  • Privately hosted
  • Clients for multiple platforms (mac, linux, windows, android, ios)
  • Secure communication between clients and server
  • Access control
  • Easy synchronization
  • Easy modification of information stored in the DB
  • Secure storage of the information
At this point, my choice is a kind of hybrid solution, manually built on top of Dropbox and using 1password as the client. In the end, the idea is to be able to revoke or grant access as easily as in Dropbox. The bad part is that you need to pay for 1password and you depend on Dropbox.

As an extra point, let me tell you that keeping the passwords that way gives you an amazing option when someone leaves the organization, because you have a list of the passwords to change and a direct way to share the new ones.
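
That "list of the passwords to change" can be sketched like this. It is a minimal sketch that assumes you keep a map of which person has access to which shared credential; the function names, the services and the people are hypothetical, and it has nothing to do with how 1password or Dropbox store things internally.

```python
from typing import Dict, List, Set


def passwords_to_rotate(access: Dict[str, Set[str]], leaver: str) -> List[str]:
    """Return the services whose shared password the leaver knew, sorted by name."""
    return sorted(service for service, people in access.items() if leaver in people)


def revoke(access: Dict[str, Set[str]], leaver: str) -> Dict[str, Set[str]]:
    """Return a copy of the access map without the leaver."""
    return {service: people - {leaver} for service, people in access.items()}


if __name__ == "__main__":
    # Hypothetical services and team members, for illustration only.
    access = {
        "aws-console": {"ana", "bob"},
        "dns-provider": {"ana"},
        "monitoring": {"bob", "carl"},
    }
    print(passwords_to_rotate(access, "bob"))  # ['aws-console', 'monitoring']
    print(revoke(access, "bob")["aws-console"])  # {'ana'}
```

Every service in the first list needs a new password, and the second map tells you who should receive the new ones.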

I hope that this post gives you more ideas than solutions, and if you decide to solve this problem, count on me as developer and tester.

Wednesday, July 3, 2013

my app is not growing, should i become Steve Jobs?

Today, when I was having lunch with a really good friend, one of our topics of conversation was the process of getting more users and keeping them using the app. Let's face it, the only way we can increase the valuation of our software is through how many users we have and their retention. Here, I would like to bring up “The Lean Startup”, where Eric Ries talks about the importance of measuring as a part of your product development cycle. Seeing how your users react to and interact with new features and ideas should influence your decisions about those features and ideas.

That is a key, we need to measure things like:
  • Daily Active Users (DAU)
  • Monthly Active Users (MAU)
  • New Users
  • Retention
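
If you already log which user was active on which day, these numbers are easy to compute. This is a minimal sketch under my own assumptions: the event format `(user_id, day)` and the function names are hypothetical, and "retention" here means "share of the users active on one day who were also active on a later day", which is only one of several definitions people use.

```python
from datetime import date
from typing import List, Set, Tuple

Event = Tuple[str, date]  # (user_id, day on which the user was active)


def active_on(events: List[Event], day: date) -> Set[str]:
    """Users with at least one event on the given day."""
    return {user for user, event_day in events if event_day == day}


def dau(events: List[Event], day: date) -> int:
    """Daily Active Users for one day."""
    return len(active_on(events, day))


def mau(events: List[Event], year: int, month: int) -> int:
    """Monthly Active Users for one calendar month."""
    return len({user for user, day in events if (day.year, day.month) == (year, month)})


def retention(events: List[Event], cohort_day: date, later_day: date) -> float:
    """Fraction of the users active on `cohort_day` who came back on `later_day`."""
    cohort = active_on(events, cohort_day)
    if not cohort:
        return 0.0
    return len(cohort & active_on(events, later_day)) / len(cohort)


if __name__ == "__main__":
    # Hypothetical activity log, for illustration only.
    events = [
        ("ana", date(2013, 7, 1)), ("bob", date(2013, 7, 1)),
        ("ana", date(2013, 7, 2)), ("cy", date(2013, 7, 2)),
    ]
    print(dau(events, date(2013, 7, 1)))                          # 2
    print(mau(events, 2013, 7))                                   # 3
    print(retention(events, date(2013, 7, 1), date(2013, 7, 2)))  # 0.5
```

New Users can be derived the same way, by looking at the first day on which each user shows up in the log.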
Those are really important values that show you if you are going in the right direction, but knowing the retention does not tell you what to do. Let's suppose that the users are not coming back to your website: you will see a drop in the DAU, so do you know what to do? If you want to solve the DAU drop, you will need to find out how to get your users back. So this is the answer to your problem:

you need the user to like your app

It is kind of obvious that increasing the DAU means keeping the user coming back as much as possible. For that, I would like to share with you what I think are the keys:

  • having an idea does not make you an expert
  • never consider yourself a regular user, come on, you are the one who had the idea.
  • if you want to develop something new just because you have a gut feeling, it could be food poisoning.. you need features that are oriented to keeping the users.
  • investing a lot of time and resources to develop what YOU believe is ok could be really expensive, it is better to invest the resources in a smart way in whatever is aligned with the users.
  • you do need A/B testing
  • because you need users to come back, you need to develop what users want/need/expect. At this point you need to know that you are not alone in this task, there are tools for collecting user feedback (like opinionlab, ideascale, userecho, feedbackify, survey monkey, google forms or any custom tool developed by yourself)
  • there are techniques like QFD that can translate user requirements into functional requirements, PLEASE USE THOSE KINDS OF TOOLS/TECHNIQUES
  • usability is really important, the UX (user experience) is also important, do not forget about that!
  • use your own data, direct marketing is a good option, take your data and profile your power users, try to get to more people like that, again.. BE SMART.
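
About the A/B testing point in the list above, the first piece you need is a stable way to split users between variants. This is a minimal sketch, assuming you have a string id per user; the function names and the experiment name are hypothetical, and deciding whether a difference between variants is statistically significant is a separate step that is not covered here.

```python
import hashlib
from typing import Sequence


def ab_variant(user_id: str, experiment: str, variants: Sequence[str] = ("A", "B")) -> str:
    """Assign a user to a variant; the same user always gets the same variant."""
    digest = hashlib.sha256(f"{experiment}:{user_id}".encode("utf-8")).hexdigest()
    return variants[int(digest, 16) % len(variants)]


def conversion_rate(conversions: int, visitors: int) -> float:
    """Share of visitors that converted (0.0 when there were no visitors)."""
    return conversions / visitors if visitors else 0.0


if __name__ == "__main__":
    # Hypothetical experiment name and user ids, for illustration only.
    for user in ["user-1", "user-2", "user-3"]:
        print(user, ab_variant(user, "new-signup-button"))
    print(conversion_rate(5, 100))  # 0.05
```

Hashing the experiment name together with the user id means a user can land in different groups for different experiments, but never flips groups in the middle of one.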


I would like to finish by answering the question:

NO, YOU SHOULD NOT BECOME STEVE JOBS.

He was a visionary, and one of his abilities was understanding people. You are special because of the abilities you have developed over time, the effort that makes you shine above the crowd, everything you can do and everything you know.



Tuesday, July 2, 2013

keeping an eye in the platform

Sometimes we feel like our platform is not doing well, maybe we see slowness in the service. Well, the answer lies in questions like:
  • are you ready to grow??
  • what is the performance of your platform??
  • do you know where the bottlenecks are??
Well, in order to discover those answers, you need an eye on your platform (maybe more than one). Today that can be fun and simple, but above all very powerful, because of the amazing amount of tools available.

If you are using a cloud provider you will get some basic metrics, but the power is somewhere else. I will give you some options, and as usual you have the task of choosing whatever you like and makes you feel comfortable.
  • nagios, basically a nice alerting system, it can use snmp (among other kinds of checks) to get metrics from servers and it shows you the status per host/service (ok, warning, critical).
  • cacti, it is just graphing, based on snmp you can monitor metrics.
  • munin, helpful for collecting data because it has its own client/server architecture (munin-node on each host reporting to a munin master), for me it is more of a monitoring tool than a graphing tool.
  • collectd, I am starting to feel love for this client/server option (like munin).
  • graphite, a nice graphing server with many frontends available.
  • ganglia, a data collecting system (client/server) with a really ugly frontend, but pretty fast.
This is an important point, YOU CAN COMBINE SOME OF THEM!!! Yes, you could use collectd+graphite+nagios. Anyway, as usual you should take care of scalability on two points: 1.- do not overload your platform with monitoring and 2.- your monitoring platform must be ready to scale with you.
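
To show how small the glue between those tools can be, here is a minimal Python sketch. It formats a metric for Graphite's plaintext protocol (one `path value timestamp` line per metric, sent over TCP, port 2003 by default) and maps a value to the exit codes that Nagios plugins use (0 ok, 1 warning, 2 critical). The function names, the metric path, the host and the thresholds are hypothetical; in a real setup collectd would do the collecting for you.

```python
import socket
import time
from typing import Optional

NAGIOS_OK, NAGIOS_WARNING, NAGIOS_CRITICAL = 0, 1, 2


def graphite_line(path: str, value: float, timestamp: Optional[int] = None) -> str:
    """Format one metric for Graphite's plaintext protocol."""
    if timestamp is None:
        timestamp = int(time.time())
    return f"{path} {value} {int(timestamp)}\n"


def send_to_graphite(line: str, host: str = "localhost", port: int = 2003) -> None:
    """Send one already formatted line to a Graphite (carbon) listener over TCP."""
    with socket.create_connection((host, port), timeout=5) as sock:
        sock.sendall(line.encode("ascii"))


def nagios_status(value: float, warning: float, critical: float) -> int:
    """Map a metric value to a Nagios plugin exit code (higher value is worse)."""
    if value >= critical:
        return NAGIOS_CRITICAL
    if value >= warning:
        return NAGIOS_WARNING
    return NAGIOS_OK


if __name__ == "__main__":
    # Hypothetical metric path and thresholds; nothing is sent over the network here.
    print(graphite_line("servers.web1.load", 0.42, 1372636800), end="")
    print(nagios_status(0.42, warning=4.0, critical=8.0))  # 0
```

One sample per host every few seconds is cheap; thousands of metrics per host every second is how you end up overloading the platform you are trying to watch.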

Please, monitor and enjoy understanding where your bottlenecks and your performance opportunities are. Remember that you need to collect and process data in order to have elements to analyze and make decisions.

Monday, July 1, 2013

may the IDE be with you

Last week, following the recommendation of +Pablo and +Omar, I decided to give PyCharm a chance. Well my friends, it is a charm! The IDE really does the work. When I refer to "the work", I am talking about support, development speed, contextualization and HELP!! It has good PEP8 formatting, code analysis, code assistance, refactoring and more.

We develop IDEs because we need to increase our productivity as much as possible. Some of us feel comfortable using a simple text editor, well.. not so simple if you (like me) are using Sublime Text. You have to see the demos and all the configuration options, like the ones available in Alex Maccaw's Blog, in order to fall in love with this editor.

Anyway, the important part of the story is this:

IDEs are made by developers for developers who try to put their needs into automated tools, and many times those needs collide with ours. So, try some IDEs, pick whatever helps you to be more productive, and remember that for us IDEs are productivity tools. And do not forget:


may the IDE be with you