On The Importance Of Iterations In (Product) Development

At LaughGuru, we received designs from the UI designers on Zeplin, a nice tool that bridges the gap between designers and developers. I can't speak for the designers, but it definitely made my life as a developer easier. My previous experience with frontend was with designers who handed me Photoshop PSDs and left it at that. A Photoshop noob, I was expected to extract layers, figure out mobile responsiveness and all of that. I'll just say it didn't end well. But I digress.

Recently, I read the all-time classic 'The Mythical Man-Month'. It had an interesting line that I even wrote down in my file of nice quotes: "Plan to throw one away; you will, anyhow." The idea is to plan on discarding the first iteration of any product you develop, because even if you don't plan for it, you'll end up throwing it away regardless. While that isn't very agile and might not apply in its entirety to most software development happening today (to be fair, the book was first published in 1975), it makes an interesting point: the first version of a product needs major reforms (or a rewrite) to end up as something usable and/or beautiful. You can't avoid that. You can only prepare for it.

Coming back to why I mentioned LaughGuru. The first version we received from the designers wasn't very impressive. It lacked colors and looked bland to my 'taste', whatever that was at the time. But my manager had something else on his mind. Unlike me, he focused on the data flow and UX side of things rather than the colors. We went ahead and implemented it anyway, and since there was no design-design, most of the focus was on functionality, routing and data flow. It became the beta of our product's latest iteration.

Later, we received newer designs which had colors. We discussed and then implemented them. This time, I could focus fully on the design side, as the functionality was already done and tested. We did decently well, and a couple of iterations later the end result was nothing short of beautiful. I used to tell my manager, "Look! It has started to look even better now," and he usually replied, unimpressed and with the typical attitude of the badass that he was, "Yes, that's kinda the idea." Of course, separation of concerns works. I was proud, and an important lesson got reinforced within me. The first iteration is usually ugly and sometimes embarrassingly silly. But it tells nothing about the practicality or viability of the underlying idea. It tells even less about whether the idea will ever be successful. So why abandon it or feel dejected if the initial iteration draws negative feedback? Instead, learn from the criticisms and stick to the larger goal of why you started it in the first place.

I was recently working on my portfolio site because, to be honest, having this blog represent me and the kind of work I do doesn't really work with anyone except developers and engineers, and that isn't a very nice (read: accessible) thing to say about a website or yourself. I wanted something fancy and shiny, so I made that. What followed was a checklist of don'ts for any software project. I tried to focus on too many things at once (content, UX, UI). I tried to get everything right in one iteration, or roughly a day. I had no plan for the development, nor any vision of what the end result should look like. I had little idea about what audience I was targeting. The end result was something I wasn't very proud of, but I still asked my friends for feedback.

While the feedback was useful in general, a friend who was particularly into UX and UI decided to guide me a bit (I'm sure he must have taken a deep breath on seeing my work). He had completed a product design course on Udacity, which taught that UI design is never complete. You will always find ways to improve, and at some point you'll have to stop yourself once the design requirements are met. But the room for improvement is always there. Another useful lesson. We went on to make small changes to the page, and with each little change, such as making a heading bold or adding some padding below an element, the page looked better than it did before. Within a day or two, the page was something that I'd proudly associate with myself.

So to close this one, let’s review the lessons we learnt:

  • Your product's first iteration is going to look like shit unless you decide to launch it too late. Don't be saddened by its lack of 'perfection', for you'll have plenty of time to polish it down the line. Focus on the basics: why you started it all and what purpose it is supposed to serve.
  • Separation of concerns is important. It is easy to get overwhelmed by the things to do, so have a plan and don’t put too many things on your plate at once.
  • It takes time to create anything worthwhile. You cannot get something right in the first iteration itself. It is against the laws of nature. Evolution required some four billion years to create the beautiful individual that you are. Don’t rush it beyond what is possible and practical.
  • In UI development, things are never done to 100%. You try to improve as much as possible through iterations, meeting business requirements and personal standards and then going well beyond that. At some point, you call it a day and move on to the next challenge.

I hope you enjoyed this little article. Thank you for reading.

16 Tips After A Year In Javascript Development

Long time no see, Javascript. Where had you been, my old friend?

Well, not really. I have been writing Javascript more than ever since my last post about the language on this blog, some 26 months ago. As some of you already know, writing Javascript has become my day job and it is honestly one of the better things that could’ve happened to me. I’ve learned a lot of the nitty-gritties of the language and surrounding environments (and the web) and gained invaluable insights about software development in general.

In this post, I aim to cover some of the things I wish I knew as a Javascript newbie (no offense to anyone; I'm a newbie myself in most ways). These aren't supposed to be comprehensive, or even correct, for that matter. No particular order is followed. These are just things I feel I could've helped myself with while learning the language and building some initial projects, and I wish to share them with you here.

It will be most helpful if you’re just finishing learning the core language and want to find out what exists in the world of web development. Feel free to contradict and correct me.

1. All Theory And No Practice

Get over that college-student attitude of reading chapter upon chapter of dull text without writing a single line of code. And by code, I don't mean the example code snippets. Go out there on Github and Reddit and search for beginner problem statements. Creating a simple TODO list or displaying the top posts from the Reddit frontpage via their API doesn't require any deep knowledge.

The essential skills to learn here are using a search engine, finding relevant documentation and getting passive help from online communities (see Stack Overflow or Reddit; r/javascript, r/webdev and r/frontend are good places to start looking). You cannot learn cycling by just reading about cycling. You learn cycling by riding a bicycle, by talking to your friends about the dangerous stunt you pulled off, by going offroad on the weekends, by watching the pros ride and then trying those things. That's what makes it fun. Don't make it a job. More often than not, you'll suck at the end of it.
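As a small sketch of the Reddit exercise mentioned above: the public listing endpoint returns JSON shaped like data.children[].data, so extracting titles is a few lines (the URL and response shape below are from Reddit's public JSON API; treat the rest as a starting point, not gospel):

```javascript
// Extract post titles from a Reddit listing response
// (shape: { data: { children: [{ data: { title } }] } })
function extractTitles(listing) {
  return listing.data.children.map((child) => child.data.title);
}

// In a browser or modern Node with network access, you'd feed it like so:
// fetch('https://www.reddit.com/r/javascript/top.json?limit=5')
//   .then((res) => res.json())
//   .then((listing) => console.log(extractTitles(listing)));
```

Keeping the extraction logic separate from the fetch call also makes it trivially testable, a theme that comes up again below.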

2. Build Tools

I made the mistake of assuming I knew everything about the language once I finished my guidebook. The reality was far from it. The language is just one little piece of the puzzle of mastering any programming stack. There are many supporting tools that I didn't care to explore back then. For example, there's something called a build tool.

In an organizational setting, you don’t start a project by creating an index.html and then linking it with your main.js. Instead, you make use of build tools (See Webpack or Grunt) to compile your code into a form understood by new and old browsers alike, bundle all of your code together and make it production ready. A basic understanding of build tools will go a long way and come in handy when making your next hobby project shine.
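To give you a feel for it, here's roughly what a minimal Webpack config looks like. This is only a sketch: it assumes your entry point is src/index.js and that babel-loader and a Babel preset are installed; adjust names and paths to your own project.

```javascript
// webpack.config.js: a minimal sketch of a build-tool configuration
const path = require('path');

module.exports = {
  entry: './src/index.js', // where the dependency graph starts
  output: {
    path: path.resolve(__dirname, 'dist'),
    filename: 'bundle.js', // the single production-ready file you ship
  },
  module: {
    rules: [
      {
        test: /\.js$/, // run modern JS through Babel for older browsers
        exclude: /node_modules/,
        use: { loader: 'babel-loader' },
      },
    ],
  },
};
```

Running `webpack` with this in place bundles everything reachable from the entry file into dist/bundle.js.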

3. Modularization

Modularization, in simple English, means separating parts of your code that work similarly. Just like we sleep in our bedroom, bathe in our bathroom and cook in the kitchen, code that does similar things should be separated into different files and functions/classes. fetch() API calls in one file, little utility functions/hacks in another, DOM manipulation functions in another and event listeners separately. This is just one of many ways of doing it; see what works best for you. Remember that your frustration level two weeks later, while hunting for a bug, is inversely proportional to the number of files and functions you separated your source code into.

The importance of modularization only strikes when you feel the pain in your back scrolling up, down and sideways looking for the particular section of a function responsible for something. You'll have to take a leap of faith here to learn this lesson the easy way.
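To make the split concrete, here's a rough sketch of one way to do it. The file names, the helper and the endpoint are all made up for illustration:

```javascript
// utils.js: little helpers live together
function formatPrice(cents) {
  return '$' + (cents / 100).toFixed(2);
}

// api.js: every network call in one place (hypothetical endpoint)
function getUser(id, fetchFn = globalThis.fetch) {
  return fetchFn('/api/users/' + id).then((res) => res.json());
}

// main.js: only wiring; import the above and attach event listeners
// import { formatPrice } from './utils.js';
// document.querySelector('#buy').addEventListener('click', () => { /* ... */ });
```

Two weeks later, a price-formatting bug sends you straight to utils.js instead of a 2000-line main.js.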

4. Consistency

You don’t have to be collaborating with other humans to follow this. Just pick up a code style and stick to it religiously (see Airbnb’s Javascript style guideline for a starting point). You’d hate yourself if you open your code editor and find things like varNames and var_names mixed all over the place. This is not limited to just syntax or micro stuff. On a broader level, follow the same style for everything. If you return a Promise from an API service function, then stick to that style. Don’t interchangeably return promises and data, for example. And don’t mix imperative, object-oriented and functional coding styles randomly. Make conscious calls on what is right for the situation, and document appropriately.

The less you think about the language semantics, the more you can think about the problem you’re trying to solve. And consistency will definitely reduce a lot of the unnecessary load of deciphering your own code.
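Here's the Promise-consistency point as a sketch (the endpoint is hypothetical): whether the data comes from a cache or the network, callers always get a Promise back and never need to care which path was taken.

```javascript
const configCache = new Map();

// Consistent: this always returns a Promise, cached or not
function getConfig(fetchFn = globalThis.fetch) {
  if (configCache.has('config')) {
    // wrap the cached value instead of returning it raw
    return Promise.resolve(configCache.get('config'));
  }
  return fetchFn('/api/config')
    .then((res) => res.json())
    .then((data) => {
      configCache.set('config', data);
      return data;
    });
}
```

Every caller can now write `getConfig().then(...)` without special-casing the cache.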

5. Documentation

Another task deprived of any love in the realm of software engineering. No one likes to document code, take it from me. But the difference between a rookie and a master is knowing when and why something must be done, irrespective of their affection for the job. Write comments and docstrings. Decorate your projects with nice READMEs and fancy "build passing" and "codecov" badges. No, you're not doing it for others. You're doing it for yourself, because you know it is right. You want to know your own thought process two months later when you read the code to make some changes. Think of it as an "enable time machine access to this point in time?" prompt in your code, which you answer with "deny" every time you decide not to document it.
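A doc comment doesn't have to be elaborate to work as that time machine. A small JSDoc-style sketch (the function itself is a made-up example):

```javascript
/**
 * Split a bill into equal shares, giving any remainder to the first share.
 * Works in integer cents to avoid floating-point surprises.
 *
 * @param {number} totalCents - total amount in cents
 * @param {number} people - number of people splitting the bill (>= 1)
 * @returns {number[]} one share per person, summing to totalCents
 */
function splitBill(totalCents, people) {
  const base = Math.floor(totalCents / people);
  const shares = new Array(people).fill(base);
  shares[0] += totalCents - base * people; // remainder goes to the first person
  return shares;
}
```

Two months from now, the "why cents?" and "where does the remainder go?" questions are answered before you even read the body.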

6. Think, Think, Think And Then Write Code

This is one of my personal favorites. I'll start coding 0.006 seconds (the number might be slightly inaccurate) after hearing the problem statement. Don't make this mistake. Don't fall for your heart saying it has understood it all. Your heart is wrong. Think about the problem. Confirm that you have understood the requirements correctly. Formalize the requirements in your mind, or better yet, in a Github issue or something. Come up with a naive approach first. Then see if you can optimize the code: think about space-time tradeoffs, think about whether you can use something like a hashmap and so on. You'll be surprised how many better solutions you come up with after thinking for 10 minutes, as opposed to jumping right into the code.
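A sketch of the kind of win those ten minutes buy you, using a classic toy problem: finding the first repeated item in a list.

```javascript
// The 0.006-second version: O(n^2), compares every pair
function firstDuplicateNaive(items) {
  for (let i = 0; i < items.length; i++) {
    for (let j = 0; j < i; j++) {
      if (items[i] === items[j]) return items[i];
    }
  }
  return undefined;
}

// The after-ten-minutes version: O(n) with a Set (the "hashmap" idea)
function firstDuplicate(items) {
  const seen = new Set();
  for (const item of items) {
    if (seen.has(item)) return item;
    seen.add(item);
  }
  return undefined;
}
```

Same answer, but one of them falls over on a million-element array and the other doesn't.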

7. Tracking Your Project’s Progress & Version Control

Use git and Github (or any other Git hosting). Use issues to write down what you plan to do in the next iteration. Use the PR (pull request) feature to fix that issue. Write down any caveats in the comments below the PR. Yes, do it even if you're the only person working on the project. It might seem unnecessary and tedious, but once you get the hang of it, your time will be much better utilized and your progress much better tracked.

8. Automate Your Deployment

I remember my workflow from before I started working professionally. Keep committing to the master branch and testing things out locally. When things looked decent enough, spin up a DigitalOcean droplet, clone the repository and start the development server there. That was my 'deployment'. If any changes needed to be made, commit to master locally, push, pull on the server and restart the development server.

Now that I know better, I not only create separate branches for each feature, I also set up auto-deployment on Heroku for particular branches (mostly master and staging/development). It's a one-click affair (assuming you followed the build tools part), takes away a lot of hassle and is free as well. Knowing the right tools can make a huge difference, as you just saw.

9. Unit Tests

If you’ve ever worked on a non-trivial project (say around a thousand lines of code), you will recognize the tragedy when you modify your code while adding a new feature and accidentally break a lot of the old code which you realize later when things start going south at which point a lot of git-fu is required to get back to a safe point and not lose any newer work. Let me tell you something. It is all avoidable with simple unit and integration tests. Think of tests as employing a bunch of robots to look over your old code and point out when you’re about to break your own (or someone else’s) code.

It is very hard to convince yourself to write tests even after knowing the benefits, let alone without knowing them. But again, you'll have to take a leap of faith here. Or you can wait until, someday, while casually refactoring some code, you mess something up that you only notice three days later, leading to hours of debugging that could've been avoided by writing a simple test.

10. Use ES6

This might be a no-brainer to most of you, but I still find people writing older Javascript syntax even though they know and understand the newer spec. With modern build tools and polyfills, not writing the latest spec syntax is, most of the time, just laziness. If you find yourself writing things the old way, force yourself to use the newer ways. They are there for a reason.

Let’s be honest guys, Javascript is not the most elegant language. The older versions are riddled with bugs in the spec itself (see function scoping of var, the this keyword, double equals (==) comparison and automatic type coercion). The newer versions of the language are an attempt at fixing things and we should respect that, even if that means going out of your way to follow the newer semantics.

11. Understand What JS Is And What It Isn't

Javascript, the core language, is actually a subset of what we're used to using in everyday development on the frontend. The core language is relatively small and simple, but add the browser APIs to it and it suddenly becomes something bigger and a bit more complex. Javascript is single-threaded, but browser-provided APIs like fetch (or XHR) and setTimeout are asynchronous thanks to the browser's magic. Learn to consciously differentiate between the two; it will come in handy. As a general rule, the DOM is very slow compared to the Javascript engine itself.
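A tiny demonstration of that split: setTimeout is handed to the host environment, so its callback can only run after the current synchronous code has finished, even with a 0ms delay.

```javascript
const log = [];

// handed off to the browser/Node timer machinery, not run inline
setTimeout(() => log.push('timer'), 0);

log.push('sync');

// At this exact point, log is still ['sync']: the timer callback waits
// until the call stack is empty before it gets a chance to run.
```

The language itself never paused; the asynchrony lives entirely in the host's timer machinery and its event loop.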

12. You Probably Don't Need That Framework

This has been repeated enough that I could’ve skipped it, but I’ll still put it out here. If you find yourself importing Lodash or Underscore for simple things like mapping, iterating over objects, filtering objects and things like that, consider looking for equivalent core Javascript alternatives. The latest ECMA spec adds a lot of helpful data structures and utility functions that make writing functional code very easy and pleasant.
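For example, most everyday Lodash/Underscore calls now have direct built-in equivalents (the data here is made up for illustration):

```javascript
const scores = { ada: 95, linus: 88, grace: 99 };

// _.toPairs(scores) -> Object.entries
const pairs = Object.entries(scores);

// _.filter + _.map -> array methods (names of everyone scoring 90+)
const top = pairs.filter(([, score]) => score >= 90).map(([name]) => name);

// _.fromPairs -> Object.fromEntries
const topScores = Object.fromEntries(pairs.filter(([, score]) => score >= 90));
```

That's one fewer dependency to download, bundle and keep updated.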

Similarly, if you are planning to use React or Vue just because everyone seems to be talking about them, don't. Go read their introductions and documentation and judge whether your project fits the use case that particular framework is built for and the problem it is trying to solve. For learning purposes, you can use the frameworks for trivial apps, but for anything else, make the decision strictly on a use-case basis.

13. Learn Basics Of CSS

While popular opinion still says that CSS is a pain in the back to master, I'd suggest learning the 60-80% of it that takes maybe 20% of the effort. No need to remember which browser supports what niche attribute. Just the important, very-handy-in-real-life parts: flexbox, grids, the box model, margins, paddings, positioning, display, floats and so on. You will cover a lot with just these, and they come in super handy for personal projects. (Again) trust me.

14. Explore Web Workers/Service Workers

What if I told you that you can hijack API requests from your webapp and service them without making a roundtrip to the server? You'll say, if the request is serviceable on the frontend itself, why bother making an API call in the first place, and I'll make a poker face at that. Jokes aside, service workers enable you to handle network connectivity drops and provide a much smoother experience. You can save the data that a client wanted to send to the server in the browser while offline, and then replay those requests when network connectivity is recovered.
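The real implementation lives in a Service Worker's fetch event handler, but the core idea fits in a few lines of plain logic. The sketch below models the strategy only (with a Map as the cache and plain objects as requests), not the actual Service Worker API:

```javascript
// While offline: serve reads from a cache, queue writes for later replay
function handleWhileOffline(request, cache, queue) {
  if (request.method === 'GET') {
    return cache.get(request.url); // undefined on a cache miss
  }
  queue.push(request); // remember the write for later
  return { queued: true };
}

// When connectivity returns, replay everything that was queued
function replayQueue(queue, sendFn) {
  while (queue.length > 0) {
    sendFn(queue.shift());
  }
}
```

In a real service worker, `cache` would be the Cache API and `sendFn` would be fetch, but the decision logic is the same.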

Web workers allow your code to delegate work to another thread. This can be any work you'd want to offload from the main thread, like heavy calculations.
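For instance, a CPU-bound function like the prime counter below would freeze the UI if run on the main thread. The commented lines sketch how you'd hand it to a worker, assuming it lives in a hypothetical worker.js:

```javascript
// CPU-heavy work that would freeze the UI on the main thread
function countPrimesBelow(n) {
  let count = 0;
  for (let i = 2; i < n; i++) {
    let isPrime = true;
    for (let j = 2; j * j <= i; j++) {
      if (i % j === 0) {
        isPrime = false;
        break;
      }
    }
    if (isPrime) count++;
  }
  return count;
}

// In the browser (sketch): the worker runs it off the main thread
// const worker = new Worker('worker.js'); // worker.js defines countPrimesBelow
// worker.postMessage(1000000);
// worker.onmessage = (e) => console.log('primes found:', e.data);
```

The page stays responsive while the worker grinds away, and you get the result back via a message.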

15. Explore Browser-Side Databases

There’s localstorage, there’s cache, there’s indexDB. This is apart from the cookies and session headers. Basically, there is no shortage of storage options on the client side. Try and use these to cache and store data that need not be sent to the server, or maybe cache in case of uncertain network availability. With service workers and Cache, you can fine tune what gets cached and the behavior when a cache is hit or miss. You can even set up caching strategies with plain Javascript code logic.

There are wrappers such as PouchDB that can sync with a server-side CouchDB and provide nice real-time syncing capabilities. Speaking of real-time, you might also want to check out Firebase by Google. I've used it in the past and it was very easy to set up and work with.

16. Explore Progressive Web Apps

Now you can make your website work offline too. Ever seen that "Add to homescreen" prompt on certain websites? When you add those websites to your homescreen, you can open them like regular Android apps and, if they're built that way, even use them offline (further reading: developers.google.com/web/progressive-web-apps).

In Closing

I hope this article was useful. If you’re a beginner with web engineering and have just finished learning the Javascript language, you might want to explore some of the above points in detail. I’m sure you’ll have fun building your next awesome project using these guidelines.

Thank you for reading.

OverTheWire Bandit 27-33 Write-up

The last part of the Bandit challenges was relatively easy, with most of the flags attainable with basic git knowledge, except for the last restricted-shell escape. Try them here: OverTheWire Bandit

Bandit 27-28

This is as simple as it can get at this stage. Just clone the repo and cat the README.md file. The flag is in plaintext.

Bandit 28-29

In this stage, if you cat the README.md file, you'll find xxxxxxx in place of the flag. If you do a git log, you'll see that the password was entered and then removed. Just check out the previous commit with git checkout {hash} and you'll have your flag in the README.md.

Bandit 29-30

There’s no commit history this time, and the README.md file says “no password in production”, which is a clue. Do a git branch -r and you’ll see a development branch. Checkout into it (git checkout dev). cat README.md in this branch to get the flag.

Bandit 30-31

No password in previous commits or branches here. But if you do a git tag, you’ll see a tag called “secret”. Do a git show secret and you have your flag.

Bandit 31-32

Add and commit any random file, remove the wildcard entry from .gitignore and push to origin. The flag is in the verbose output of the push.

Bandit 32-33

This is a restricted terminal escape challenge, very interesting. I urge you to think of creative ways of loopholing this before looking at the solution.

The terminal converts every command into uppercase before executing it, so ls becomes LS, cd becomes CD and nothing works.

One way of loopholing this behavior is symlinking a helper binary to an all-caps name. I chose vim for the purpose, but cat, less or more, anything would've worked. Symlink the binary into your temp directory under some all-caps name.

$ ln -s /usr/bin/vim /tmp/mytempdir/VIM

Now, from that directory, simply running ./vim (which the shell uppercases to ./VIM) will execute vim, and you can then read the flag file with :r /etc/bandit_pass/bandit33 from inside vim.

Thank you for reading.

HTTPS By Default

There was a time, not too long ago, when I used to dream of running a website with HTTPS in the address bar. Most big websites had it, but very few small ones and personal blogs did. I couldn't, because it required a lot of money (for a student, at least). Today, thanks to the efforts of Let's Encrypt, Cloudflare and many other organizations that give away basic SSL certificates for free, all of the domains I own are on HTTPS. But that's not even the best part.

The best part is that HTTPS is now seen less as a luxury by small website owners and more as a necessity. Part of the reason for this is Google's search-ranking penalty for websites without HTTPS. The other part, which matters more to little non-business blog owners who do not care much about traffic and SEO, is that Google Chrome (>= 68) has started displaying a "Not Secure" label for non-HTTPS websites. I would say Google is doing a great job of giving people an incentive to switch. I hope Mozilla Firefox follows in Chrome's footsteps in this regard.




[Screenshot: a non-HTTPS website on Google Chrome]

[Screenshot: a non-HTTPS website on Mozilla Firefox]

On similar lines, Github now shows the entire web address (protocol included) only if a link is on HTTP. If the link posted is HTTPS, Github truncates the protocol part and shows only the domain name.

For example,

http://example.com

will be rendered as

http://example.com

while

https://example.com

(notice the https) will be rendered as

example.com

If you notice, this is the opposite of what happens in browsers today, where http is the part that gets truncated and https is emphasized.

I personally hope more and more websites and browsers treat HTTPS as the default and HTTP as the exception, and not the other way round. The merits of using HTTP over HTTPS are either obsolete or negligible (read more about HTTPS here: ELI5 – How HTTPS Works). On the contrary, if set up properly, HTTPS can even be faster than HTTP.

We already know that the more people use encryption, the greater the overall value of using it. I'll end here with a quote by Bruce Schneier, a cryptology badas- expert.

Encryption should be enabled for everything by default, not a feature you turn on only if you’re doing something you consider worth protecting.

This is important. If we only use encryption when we’re working with important data, then encryption signals that data’s importance. If only dissidents use encryption in a country, that country’s authorities have an easy way of identifying them. But if everyone uses it all of the time, encryption ceases to be a signal. No one can distinguish simple chatting from deeply private conversation. The government can’t tell the dissidents from the rest of the population. Every time you use encryption, you’re protecting someone who needs to use it to stay alive.

https://www.schneier.com/blog/archives/2015/06/why_we_encrypt.html

Thank you for reading.

System Stability & Moving Back To XFCE

One thing that I really hate, and I don't use that word very often when describing my computer preferences, is system crashes. It's one of those things that are just unacceptable to me. You're working on something important, and all of a sudden the DE (Desktop Environment) decides it needs to restart itself, and you lose all of your windows, terminals and, most importantly, context. Coming back from there is a 15-minute process in itself: logging back in, starting the browser, IDE and terminals, entering virtual environments, running test servers and so on. As you can tell, it can escalate from a slight inconvenience to very frustrating in little time.

When I got my new laptop back in May, I decided to switch away from XFCE. To be honest, I did try installing XFCE first, but couldn't get the DE to start due to some issue. Since this was a fancier laptop with better hardware, I assumed I could afford to run a somewhat heavier DE for a better user experience (and my colleagues' Macbooks constantly reminded me that I was using an ancient-looking DE). I did some research and was split between KDE and GNOME 3.

The initial impressions of GNOME 3 were not very convincing (not that this was the first time I had tried it anyway). I never liked the gesture-like way of accessing windows and the quick menu; I'm more of a click-click person. But I decided to stick with it and see how it went, customizing whatever I could. After that, things went uphill for a while. The more I used GNOME, the more I started to appreciate it. I brought back the 'conventional' application menu, a quick access bar on the left side with an extension called Dash to Dock, Pomodoro and a bunch of widgets for the top bar (which is mostly empty by default).

A few issues persisted from the beginning. The most important one was memory and CPU usage. I looked it up and concluded that it is a general problem and not just my laptop. The problem is not the high usage of system resources per se (which can even be a good thing if you trust the kernel). The problem is when you see gnome-shell constantly occupy one CPU core while idling and use 500MB-1GB of RAM just after startup. Due to this, I constantly faced situations where RAM usage would go over 90% and the system would start to lag. This was serious, but it wasn't the worst part.

I could’ve lived with a (little) laggy system, a system that lags while opening the app drawer, for example (tip: create shortcuts to all the apps that you frequently use to avoid GNOME 3’s app drawer altogether), but the DE would also crash all of a sudden, wasting my time re-spawning everything. And it was especially bad when it happened during my work hours. That was a deal breaker. I tried to debug it, but couldn’t convince myself to spend more time on it as it wasn’t making a lot of sense. I installed XFCE, made it work and it felt like I’m back home to my countryside house from a vacation in the city. Felt good.

In conclusion, I think I'm biased here. I had a preconceived notion about GNOME 3, and I might have fallen for it. Maybe GNOME 3 is objectively better at many things that my bias didn't let me see. Don't get me wrong: GNOME 3 is a wonderful DE, and for someone who values the bells and whistles that come with it (I had three- and four-finger gesture support for once in my life; thanks, GNOME), I think it is a perfect choice. For me, however, system stability is far more important than any secondary convenience feature.

An interesting thing I noticed Mac OS users do was always suspending their machines instead of shutting them down. I had wanted to do that forever: just close the lid and be done with it. I never could on my old laptop, because new issues would creep in after resuming from the suspended state (failure to connect to wifi, the display staying black, USB ports not working, to name a few). No such issue is present on my Thinkpad, and as a result, I suspend it between uses and at night. The system is rock solid, even under heavy loads. As an enthusiast, I take a lot of pride in mentioning that my last shutdown was nearly ten days ago.

22:00:09 up 9 days, 11:17, 1 user, load average: 0.84, 0.58, 0.59

I’m sure a lot of you reading this can relate to the pride of showing off uptimes and talking about system stability, or the joy of keeping your car running in top notch condition after years with proper service and care! This is similar. Hope you found this interesting. Thank you for reading.

Backend Development With Flask

Context

I distinctly remember how much I liked writing backends (well, I tried to, at least). I also remember thinking: I will never become a frontend engineer. Ever. Like many software developers in my circle, I hated writing CSS, and JS hadn't clicked until then. And these were the days before I knew anything about automated deployments. Deployment, for me, was spinning up a DigitalOcean droplet, installing everything manually, setting up the database (and copy-pasting the db creds into the code), then running the development server and keeping it running. Stop laughing, please.

Motivations

So I decided to pick up some backend skills again. The main reason was that I'll need a full-time job soon, and it would be much better if I could write the full stack of the web instead of just the frontend. Secondly, knowing the full stack is a superpower I wouldn't want to miss. It comes in incredibly handy when working on personal projects that need the web as an enabler. Thirdly, I wanted to refresh my Python skills. I liked Python a lot back in the day, but I had lost touch over the last couple of years. So with those goals, I started with Python and Flask.

Python & Flask

I like Flask, but I'm still struggling with some basics, especially working with app instances when writing tests. Yes, there are a few differences working with the backend this time versus, like, three years ago. I'm following (or trying to follow) best practices and writing tests for the code, and I have a nice CI pipeline which starts with testing the code and ends with deploying the app on Heroku. Most importantly, I have an excellent mentor for my backend adventures who's a badass Python programmer.

Flask is fun to work with. The framework is minimal, kinda like React. There's a lot of support online, and there are great plugins already available for most common functionality. Databases are one of my weakest points in web engineering, and I'm trying to experiment with a lot of things on the model layer with SQLAlchemy and a Postgres backend. One more novelty for me was asynchronous programming. In Javascript, you had to beg for things to be synchronous. Here you face the opposite problem: if something is slow, the entire thread is blocked. To take care of slow things, say sending an email, one can use Celery with a RabbitMQ backend. All of this is given to us ready-made by Heroku. So no more manual DevOps work, and fewer variables to worry about.

The other motive was to learn quality Python 3. In Python, there is a pythonic way of doing things and many non-pythonic ways. There's no point in writing Python like C. I wish to learn the philosophy of the language so that I can make the right choices when deciding how to solve a problem. There's nothing like writing clean and elegant code that others can appreciate. In the last week, I also got exposed to a lot of the different data structures the Python standard library provides for specific use cases. In the right scenario, using an appropriate data structure can be the best optimization you make to your code. I'm looking forward to getting a hold of this as well.

Lastly, one thing I never thought I'd do, but am doing, is trying to learn object-oriented programming properly. I had done some OO Python in the past, but it never clicked. Later, with JS, it was all about functional programming. Now I've rediscovered OO programming and wish to relearn it, apply it and try to make it click. I like the contrast between the two paradigms, and it could not be explained better than in this StackOverflow answer.

  • Object-oriented languages are good when you have a fixed set of operations on things, and as your code evolves, you primarily add new things. This can be accomplished by adding new classes which implement existing methods, and the existing classes are left alone.
  • Functional languages are good when you have a fixed set of things, and as your code evolves, you primarily add new operations to existing things. This can be accomplished by adding new functions which compute with existing data types, and the existing functions are left alone.
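That contrast becomes concrete with a toy example (the shapes here are my own illustration): with classes, adding a new thing touches no existing code; with plain functions over data, adding a new operation touches no existing code.

```python
import math

# Object-oriented: the set of operations (area) is fixed; adding a new
# thing means adding a new class, and existing classes are left alone.
class Circle:
    def __init__(self, r):
        self.r = r
    def area(self):
        return math.pi * self.r ** 2

class Square:  # new thing: just a new class, nothing else changes
    def __init__(self, side):
        self.side = side
    def area(self):
        return self.side ** 2

# Functional: the set of things (plain tuples) is fixed; adding a new
# operation means adding a new function, and existing functions are left alone.
def area(shape):
    kind, size = shape
    return math.pi * size ** 2 if kind == "circle" else size ** 2

def perimeter(shape):  # new operation: just a new function
    kind, size = shape
    return 2 * math.pi * size if kind == "circle" else 4 * size

assert Square(3).area() == 9
assert area(("square", 3)) == 9
```

Neither shape of extensibility is "better"; which one you want depends on which axis your code is more likely to grow along.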

Overall, I feel I’m becoming a little (very little, indeed) more mature with programming. Instead of sticking to paradigms and trying to defend the one I’m most comfortable with, I’m trying to see why those paradigms exist and what problems they help me solve. And since Python supports both object-oriented and functional programming, it will be fun to work with both.

I’ll write some technical articles on the subject in the near future, when I feel confident enough. Just wanted to give you an update on what’s happening on my front in this one. Hope you found it useful. I’ll leave you with an interesting video about ‘Duck typing and asking forgiveness, not permission’, which is a design pattern in Python. Thank you for reading.

ELI5 – How HTTPS Works

Let’s start with some basics. Just like when you want to talk to another person, you talk in a language that both of you understand, every system that wants to talk to another system needs to talk in a commonly understood language. Technically, we call it a protocol. HTTPS is a protocol, and so is English. Every protocol is designed with some goals in mind. For real-world languages, the goals are simple. They are usually communication, literature and so on. With computer protocols, the goals have to be more stringent. Usually, different computer protocols have very different purposes. For example, File Transfer Protocol (FTP) was (and still is) widely used for transferring files, Secure Shell (SSH) is used for remote administration and so on.

Note that we’re only talking about application layer protocols in the Internet Protocol Suite. Once the appropriate protocol in the application layer creates a packet for transmission, it is encapsulated in many coverings, one by one, by all the layers beneath it. Each layer attaches its own header to the message, which then becomes the message for the next layer to attach its header to. The reverse of this process happens on the recipient’s end. It is easier to imagine this process as peeling the layers of an onion.
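A crude sketch of that wrap-and-peel process, with each layer’s header modeled as a plain string (the layer names are real, everything else is a toy):

```python
# Toy model of encapsulation: each layer prepends its header on the way
# down the stack, and the receiver strips them in reverse order on the way up.
layers = ["TCP", "IP", "Ethernet"]  # layers below the application layer

def encapsulate(payload):
    message = payload
    for layer in layers:
        message = f"[{layer}]{message}"  # each layer wraps the one above it
    return message

def decapsulate(message):
    for layer in reversed(layers):
        header = f"[{layer}]"
        assert message.startswith(header)
        message = message[len(header):]  # peel one onion layer
    return message

wire = encapsulate("GET / HTTP/1.1")
assert wire == "[Ethernet][IP][TCP]GET / HTTP/1.1"
assert decapsulate(wire) == "GET / HTTP/1.1"
```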

With that set, we’ll start our discussion about HTTPS. HTTPS, or HTTP Secure, is an application layer protocol that provides HTTP traffic encryption using TLS (Transport Layer Security) or its predecessor, SSL. The underlying application doesn’t have to worry about HTTP versus HTTPS, and once the initial handshake is done, for the most part, it is just an HTTP connection, one that runs over a secure tunnel. I’ve been a frontend engineer and I’ve never written any specific HTTPS code, ever. That’s the magic of TLS.

What’s TLS?

So HTTP that is encrypted using TLS is HTTPS. Got it. But what about TLS then? For starters, TLS is a hybrid cryptosystem. It uses multiple cryptographic primitives under the hood to achieve its goals.

Aside on cryptographic primitives: Cryptographic primitives, like symmetric encryption, block ciphers and so on are designed by experts who know what they’re doing. The role of protocol implementers is to take these primitives and combine them in useful ways to accomplish certain goals.

TLS uses symmetric key encryption, asymmetric key encryption, and (sometimes) a message authentication code to establish an encrypted bidirectional data tunnel and transfer encrypted bits. We’ll explore how each primitive is used to attain a particular goal in a bit. With these primitives, particularly with public key infrastructure (PKI), TLS establishes the identity of one or both of the parties involved in a communication (your browser and the web server in most cases). Then, a key is derived at both ends using another primitive, Diffie Hellman or RSA, which are asymmetric key crypto algorithms. Once derived, this key can be used as the session key in a symmetric key algorithm like AES. If an authenticated encryption mode (such as GCM) is not used, then a MAC algorithm (such as HMAC) might also be needed. Also, a hashing algorithm (such as SHA256) is used to authenticate the initial handshake (and as a PRF if HMAC is used). Let’s follow a typical HTTPS handshake and see what we learn along the way.

In the beginning…

In the beginning, there was no connection. You open your browser and type in nagekar.com. The following things will happen in that order, more or less.

  • Your browser sends a DNS resolution request for nagekar.com.
  • Your router (or any DNS resolution service) provides the IP address of the host.
  • Now the three-way TCP handshake that we studied in our networking classes happens (SYN -> SYN/ACK -> ACK).
  • After establishing a TCP connection, your browser makes a request to 104.28.11.84 for host nagekar.com. The server responds with a 301 Moved Permanently as my website is only accessible over HTTPS and with the WWW subdomain.
  • Now starts the TLS handshake. First, the client sends a client hello. It contains the following important pieces of data:
    • A random 28 byte string (later used for establishing session key).
    • Session ID (used for resuming a previously established session and avoiding the entire handshake altogether; 0 here because no previous session was found).
    • Cipher suites supported by the client in order of preference.
    • Server name (this enables the server to identify which site’s certificate to offer to the client in case multiple websites are hosted from a single IP address, as in the case with most small/medium websites).
  • Then server sends a server hello which has the following important pieces of data:
    • Another random 28 byte string (later used for establishing the session key).
    • Cipher suite selected by server (in our case, the server selected TLS_ECDHE_ECDSA_WITH_AES_128_GCM_SHA256 which was our most preferred cipher suite)
  • At this point, both client and server have the necessary information to establish an encrypted tunnel, but one important detail is missing. Neither party has verified the identity of the other (and if that isn’t done, it really defeats the purpose of everything that came before, since an active man-in-the-middle adversary could easily break the scheme). This is done in the certificate message. In most cases, only the client verifies the identity of the server. Here’s how it looks:
  • And this is exactly what you see when you click on the green padlock icon in your address bar and try to see more information about the certificate offered by the website.
  • At this point, the server hello is done. It is indicated in the message that the server won’t be asking the client for a certificate.
  • The server sends one half of the Diffie Hellman key in a separate Server Key Exchange message. Following this, the client sends the other half of the Diffie Hellman key exchange. After that, the client sends a Change Cipher Spec message, which means any subsequent message from the client will be encrypted with the schemes just negotiated. Lastly, the client sends its first encrypted message, an encrypted handshake.
  • Along similar lines, the server issues the client a Session Ticket, which the client can use to resume connections without going through the entire Diffie Hellman procedure again (although it is valid only for 18 hours in our case). The server then sends a Change Cipher Spec message, indicating that no more plaintext messages will be sent by the server. Lastly, the server sends its first encrypted message, an encrypted handshake, just like the client.
  • That’s it. We have established a secure connection to a computer on the other side of the planet and verified its identity. Magic!
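Python’s standard library `ssl` module gives you these browser-like guarantees by default. A quick sketch of the client side (the actual network round trip is left as a comment so the snippet stays self-contained):

```python
import ssl

# create_default_context() configures a TLS client the way a browser would:
# certificates are verified against the system trust store, and the hostname
# must match the certificate -- the identity check from the handshake above.
ctx = ssl.create_default_context()
assert ctx.check_hostname is True
assert ctx.verify_mode == ssl.CERT_REQUIRED

# Wrapping a socket with this context performs the whole handshake.
# server_hostname is also what goes into the SNI field of the client hello:
#
#   import socket
#   with socket.create_connection(("nagekar.com", 443)) as sock:
#       with ctx.wrap_socket(sock, server_hostname="nagekar.com") as tls:
#           print(tls.version(), tls.cipher())  # TLS version + cipher suite
```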

Crypto Primitives

Let’s discuss which goal of cryptography is achieved by which part of this entire handshake. Remember the cipher suite that the server chose? It was TLS_ECDHE_ECDSA_WITH_AES_128_GCM_SHA256.

ECDHE: Ephemeral Elliptic Curve Diffie Hellman, as we saw, is used to establish a shared secret session key from the random values our client and the server exchanged (over an insecure channel). It is a key exchange algorithm.

ECDSA: The Elliptic Curve Digital Signature Algorithm is used to verify the public key supplied by the server, in our case nagekar.com’s, issued to Cloudflare by COMODO.

AES 128 with GCM: AES is a block cipher. Being a symmetric key encryption algorithm, it is much faster than the asymmetric key ones, and hence used for encryption of all the data after the initial handshake is done. 128 is the size of the key in bits, which is sufficiently secure. GCM stands for Galois/Counter Mode, which is an encryption mode used with AES to provide authentication and integrity along with the regular confidentiality.

SHA256: Since we’re using AES with GCM, we won’t be using this hash function for message authentication. However, since TLS 1.2, SHA256 is used as a PRF. It is also used to verify that the content exchanged during the handshake was not tampered with.
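The hashing side of this is easy to play with, since SHA256 and HMAC both live in Python’s standard library (the key and messages below are made up):

```python
import hashlib
import hmac

# SHA256: arbitrary input, fixed 256-bit (32-byte) digest
digest = hashlib.sha256(b"client hello || server hello || ...").hexdigest()
assert len(digest) == 64  # 32 bytes rendered as hex

# HMAC: a keyed hash -- only someone holding the key can produce or
# verify the tag, which is what gives message authentication
tag = hmac.new(b"session-key", b"GET / HTTP/1.1", hashlib.sha256).digest()
expected = hmac.new(b"session-key", b"GET / HTTP/1.1", hashlib.sha256).digest()
assert hmac.compare_digest(tag, expected)  # constant-time comparison
```

Note the use of `hmac.compare_digest` rather than `==`: comparing tags byte by byte with an early exit can leak timing information.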

Security Considerations

About trust: As you might have noticed, all the above steps were essentially so that two random computers can come up with a shared secret session key. The other part of this is Certificate Authorities. Why did we trust the certificate that the server sent us? Because it was signed by someone whom we trust. At the end of it all, you still have to implicitly trust someone to verify identity. In this case, we’re trusting COMODO to have signed only one certificate for the domain in question.

About browsers and updates: If you look at the version of TLS we used, it is 1.2, which is not the latest. The cipher suite is also not the best we could’ve got. Why did that happen? Simple: because we were using an outdated browser which didn’t support the strongest cipher suites and the latest version of TLS. Since that was a test machine, it doesn’t matter a lot. On any up-to-date browser, this is what you should see.

About cryptographic primitives: We saw some of the best-understood crypto primitives being used in the handshake. This is a pattern you’ll see often while reading about cryptology. It is a sin to implement your own crypto, especially the primitives. Use a library that implements these primitives, or better yet, the entire cryptosystem.

About mathematics: The reason we think the above scheme is secure, that no data is leaked even though the key was derived from information sent in the clear, is that some of these cryptographic primitives are based on hard problems in mathematics. For example, since mathematicians believe that modular exponentiation is easy but computing discrete logarithms the other way is hard, we say that Diffie Hellman (which makes use of discrete logarithms) is secure. Similarly with RSA: mathematicians believe that factoring the product of two large prime numbers is a hard problem, hence RSA is considered secure as long as the numbers are large enough. Of course, a mathematical proof is not always available. For example, AES is considered secure, but there’s no proof that it is. We think it must be secure because the brightest minds in cryptology have spent thousands of man hours trying to break the encryption algorithm and haven’t succeeded (yet?).
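The discrete-log idea behind Diffie Hellman fits in a few lines with toy-sized numbers. Real deployments use primes of 2048+ bits or elliptic curves; the values below are purely illustrative:

```python
import secrets

# Public parameters: a prime modulus p and a generator g (toy-sized!)
p, g = 23, 5

# Each side picks a private exponent and publishes g^x mod p.
# Recovering x from g^x mod p is the discrete logarithm problem --
# easy to compute forward, believed hard to reverse at real sizes.
alice_secret = secrets.randbelow(p - 2) + 1
bob_secret = secrets.randbelow(p - 2) + 1
alice_public = pow(g, alice_secret, p)
bob_public = pow(g, bob_secret, p)

# Both sides arrive at the same shared secret without ever sending it:
# (g^b)^a mod p == (g^a)^b mod p == g^(ab) mod p
alice_shared = pow(bob_public, alice_secret, p)
bob_shared = pow(alice_public, bob_secret, p)
assert alice_shared == bob_shared
```

An eavesdropper sees p, g and both public values, but without solving a discrete log (trivial at this toy size, infeasible at real sizes) cannot compute g^(ab).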

In Closing

As you can guess, a lot of important details are skipped in this article. There are two reasons for that: 1. I lack the necessary knowledge to simplify the deeper parts, and 2. it would be boring to read if the post felt like a spec. If you wish to read more, refer to the list of references below this section.

References

Thank you for reading!

ELI5 – Deterministic Encryption

Suppose you have a database full of confidential information such as emails of users. As a responsible sysadmin, you’d not let such data exist in plaintext in your systems and therefore you decide to encrypt everything. But now, the application needs a searching functionality where users can see their emails in the system.

Essentially, you need to run a where email = ‘ ‘ query on the encrypted database, get all the rows that match, decrypt them and send the decrypted data to the application layer. With traditional encryption modes like CBC or modern authenticated encryption modes like GCM, this would be impossible (or extremely inefficient). This is where deterministic encryption comes into the picture.

Deterministic Encryption

Deterministic encryption is nothing fancy. Instead of using modes like CBC and CTR, where each block of the ciphertext depends on the previous block or a message counter, deterministic encryption can be imagined as encrypting data in ECB mode or with the IV kept constant. No nonce is involved. Basically, a plaintext message M will always map to the same ciphertext C in a given deterministic encryption scheme under a particular key K.

Once the data is encrypted into ciphertext, it is sorted and stored. Now, when a search term comes up, it is encrypted using the same key, then the database is queried for this ciphertext which returns all the rows that match. The application can then decrypt these rows with the given key. The search takes logarithmic time (since for the database, this is a normal text search) and the database never sees any data in plaintext.
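To see why determinism enables the lookup, here is a toy sketch that uses HMAC as the deterministic transform. (This is only an illustration of the equality-search property: HMAC is one-way, so a real system would use an invertible deterministic scheme such as AES-SIV, or store the real value encrypted alongside the search token.)

```python
import hashlib
import hmac

KEY = b"toy-key-never-do-this-in-prod"

def det_token(plaintext):
    # Same plaintext + same key -> same token, every single time.
    # That determinism is the entire point.
    return hmac.new(KEY, plaintext.encode(), hashlib.sha256).hexdigest()

# A "database" indexed by deterministic token instead of plaintext email
rows = {
    det_token("alice@example.com"): {"id": 1},
    det_token("bob@example.com"): {"id": 2},
}

# A WHERE email = ? query becomes an exact match on the token --
# the database never sees a plaintext email address
assert rows[det_token("alice@example.com")] == {"id": 1}

# ...and this is also exactly the leak: equal plaintexts are
# visible to anyone as equal tokens
assert det_token("bob@example.com") == det_token("bob@example.com")
```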

Even with all of this, deterministic encryption faces the issues that plague modes like ECB. Namely, if two plaintexts are the same, their ciphertexts will be identical as well, thus leaking information to an adversary. Formally, we say that deterministic encryption can never be semantically secure under a chosen plaintext attack. That doesn’t diminish its value, though, when we have to deal with searchable encrypted datasets.


(Image from Wikipedia’s block cipher modes of operation page)

This means that deterministic encryption cannot (or rather, should not) be used when the message space M is small. It should only be used on values such as email addresses or usernames, which are designed to be unique in a database.

Further reading:
https://crypto.stackexchange.com/questions/6755/security-of-deterministic-encryption-scheme
https://en.wikipedia.org/wiki/Deterministic_encryption

Thank you for reading.

ELI5 – Format Preserving Encryption

Most block ciphers work on the bytes and bits of data. It doesn’t matter to them if the data is a video, a phone number, a credit card number or a meme of Patrick Star. And that’s good. It means that a generic block cipher can handle almost all the traffic encryption over TLS while we’re busy browsing Reddit with YouTube playing music in another tab. And since the underlying network handles binary data just as well and efficiently, no one complains.

And it is generally a bad practice to write domain-specific encryption algorithms. That is the reason you’ve never heard of AES Image Encryptors. But sometimes, for very specific use cases, it becomes necessary to have something like that. Something like an AES for images, so to speak.

The Problem

What if a legacy application needs to add encryption for some of its data without changing any of the existing data structures? AES works on 128-bit blocks of data, DES on 64-bit blocks, so neither will work for phone numbers or credit card numbers. At least, not without changing the underlying data structures required to store the encrypted data. And suppose we cannot change the underlying architecture for various reasons, one of them simply being that we do not control some of the machines that pass our data around. That’s where we need format preserving encryption (FPE).

Format Preserving Encryption

While researching this topic, I came across a beautiful construct by cryptographers John Black and Phillip Rogaway. The construct is simple, and the best part is that it uses a block cipher as a pseudo-random permutation (in the case of block ciphers with larger block sizes, we truncate their outputs to the desired bit size by taking only the N least significant bits), thus inheriting all the goodies of the underlying block cipher. Let’s look briefly at how this method works.

Let the message space be M. In the case of phone numbers, that’s 0 to 9,999,999,999 (that’s for India, and while the actual message space is much smaller than that, there’s no harm in assuming the entire range). The number of bits required to store this information is log2(10^10) ≈ 33.2, i.e. 34 bits. So we can fit the ciphertext in 34 bits, assuming no padding or integrity checks. Now imagine two sets, X and Y, with X a superset of Y. In this construct, X represents the set of all possible ciphertexts that we can get by encrypting each Mi with our block cipher. Y represents the set of allowed ciphertexts, that is, ciphertexts that are less than or equal to the maximum value of our message space M (9999999999 in our example).

Now, when you encrypt a phone number with the block cipher, there’s a good probability that the resulting value fits within our phone number space (10 digits, or 34 bits, assuming we’re truncating the cipher’s output to 34 bits as well). If that’s the case, that’s our answer. If not, encrypt the ciphertext again and check. Continue until you reach a number that fits in 10 decimal digits.

Now, while some of you might think (I certainly did) this would result in a long loop, it does not (with high probability). This solution not only works but works efficiently: on average, the answer is found in about 2 iterations, with a better-than-50% probability of finding it in each iteration. That’s pretty cool if you ask me!
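The cycle-walking idea can be sketched with a toy 34-bit Feistel network, using HMAC-SHA256 as the round function so the whole thing runs on the standard library. This is an illustration of the technique, not a vetted FPE scheme (for production, use a standardized construction such as FF1):

```python
import hashlib
import hmac

BITS = 34            # 2**34 > 10**10: smallest even width covering 10 digits
HALF = BITS // 2     # 17-bit Feistel halves
MASK = (1 << HALF) - 1
SPACE = 10**10       # message space: all 10-digit phone numbers

def _round(key, value, i):
    # HMAC-SHA256 as the Feistel round function, truncated to HALF bits
    data = bytes([i]) + value.to_bytes(4, "big")
    digest = hmac.new(key, data, hashlib.sha256).digest()
    return int.from_bytes(digest[:4], "big") & MASK

def _feistel(key, x, rounds=8, decrypt=False):
    # A balanced Feistel network is an invertible permutation over 34 bits
    left, right = x >> HALF, x & MASK
    order = reversed(range(rounds)) if decrypt else range(rounds)
    for i in order:
        if decrypt:
            left, right = right ^ _round(key, left, i), left
        else:
            left, right = right, left ^ _round(key, right, i)
    return (left << HALF) | right

def encrypt(key, number):
    # Cycle walking: re-encrypt until the value lands back inside the format
    c = _feistel(key, number)
    while c >= SPACE:
        c = _feistel(key, c)
    return c

def decrypt(key, number):
    # Walk the cycle in the other direction
    p = _feistel(key, number, decrypt=True)
    while p >= SPACE:
        p = _feistel(key, p, decrypt=True)
    return p

c = encrypt(b"demo-key", 9_876_543_210)
assert 0 <= c < SPACE                             # still a 10-digit number
assert decrypt(b"demo-key", c) == 9_876_543_210   # round-trips correctly
assert encrypt(b"demo-key", 9_876_543_210) == c   # deterministic per key
```

Since 10^10 / 2^34 ≈ 0.58, each walk step lands in range with better-than-even probability, which is where the "about 2 iterations on average" figure comes from.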

In an ideal world, you’d want to rewrite the logic and underlying data structures such that native AES is possible. In this one, format-preserving encryption would work just fine. Thank you for reading.

ELI5 – Key Derivation Function

We’ve heard that AES and other block ciphers require keys of specific sizes: 128, 192 or 256 bits. But I don’t ever remember having to calculate my password length based on the underlying key size. Never have I read on a website “passwords need to be 16 ASCII characters, 1 byte each, to make a total of 128 bits of key material”. So what lies between me entering an arbitrarily sized password and the encryption algorithm receiving a nicely sized 128/256-bit key? Let’s find that out in this ELI5.

Key Derivation Function

A Key Derivation Function (wait for it…) derives cryptographic key(s) from a password. Generally speaking, the passwords we humans come up with are something like “MyAwesomeDog007” which, while long and easy to remember, just don’t have enough entropy for cryptographic applications. On the other hand, a key derived from such a simple password, something like “ml6xU*dwGS5rvE!dcIg6509w$$” (that’s not a real key; a real key would in most cases be binary), is complex and entropy rich. This is the first purpose a KDF serves: to increase the entropy of a password, making it suitable for use in other algorithms such as AES.

The second purpose that KDFs serve is that they make brute forcing infeasible. Due to the high computational costs of running a good KDF, brute forcing is typically not achievable for any half decent password. Of course, it won’t protect a user from a dictionary attack if she selects a password such as “password123”.

Working

A KDF takes an arbitrarily sized, low-entropy input (a user-supplied password, for example), runs some hash-based algorithms on it, and outputs a random-looking, fixed-size cryptographic key (which later becomes the input key to encryption and MACing algorithms). A KDF can be thought of as a pseudo-random function (PRF) which maps an input password to an output key. As a PRF, the input and output mappings should look completely random to an attacker, and under no circumstances should he be able to recover the original password from the cryptographic key (that is, the function should be one way). The high iteration count makes computing the KDF an expensive affair. This is acceptable for a legitimate user but prevents brute forcing of the password.

Typically, key derivation functions employ keyed hash algorithms, or HMAC. A cryptographic salt is used to prevent rainbow table attacks (precomputed hash lookups). The number of iterations of the hash function (on the order of tens to hundreds of thousands) is selected to slow down brute-force attacks.

Implementations

A simple key derivation function is Password-Based Key Derivation Function 2, PBKDF2. It takes as input a pseudo-random function (such as HMAC-SHA256), the user-supplied password, a salt (64+ bits), the number of iterations and the desired output key length, and outputs a key of the specified length.
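PBKDF2 ships in Python’s standard library as `hashlib.pbkdf2_hmac`. A quick sketch (the iteration count and salt size here are reasonable illustrative choices, not gospel):

```python
import hashlib
import os

password = b"MyAwesomeDog007"
salt = os.urandom(16)  # random 128-bit salt, stored alongside the derived key

# PRF = HMAC-SHA256, 200k iterations, 32-byte (256-bit) output key
key = hashlib.pbkdf2_hmac("sha256", password, salt, 200_000, dklen=32)
assert len(key) == 32

# Same password + same salt + same iteration count -> same key.
# That's the point: you can re-derive and compare at login time.
again = hashlib.pbkdf2_hmac("sha256", password, salt, 200_000, dklen=32)
assert key == again

# A different salt gives a completely different key, defeating rainbow tables
other = hashlib.pbkdf2_hmac("sha256", password, os.urandom(16), 200_000, dklen=32)
assert other != key
```

The 200,000 iterations are exactly the deliberate slowness described above: negligible for one login, crushing for an attacker trying billions of guesses.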

Although PBKDF2 is still used and recommended, modern alternatives such as scrypt and Argon2 offer much better resistance to brute force.