Monthly Archives: August 2017

Private Cloud Part 2 | Encrypted Storage With NextCloud

New cloud setup. YAAY! Self hosted, encrypted and scalable. Plus comes with a nice web interface, native Linux and Android clients and its very own app store. I’ll first write about the setup itself, and then some of my personal thoughts over the entire private cloud exercise.

Features Overview

The major components of the setup include the following

  • NextCloud 11 on Ubuntu using Digital Ocean’s one click installer on a 5 USD cloud vps
  • Digital Ocean’s flexible block storage
  • Let’s Encrypt for free TLS
  • NextCloud sync client for Arch and Android on desktop and phone respectively for data sync
  • DavDroid for contacts and calender sync on Android (uses WebDAV)
  • Optional redundant backup and client side encryption using GnuPG (see below)

Pros Vs Cons

So I now have a proper private cloud, self hosted, synced across mobile and desktop (including contacts, messages and calender), optional client-side encryption and scalable (♥DigitalOcean♥). What’s amazing is that I never had a native Google Drive client on desktop, but now I have a native NextCloud client, and it just works. And yes, it isn’t all sunshine and rainbow. There are some serious trade-offs which I should mention at this point, to make this fair.

  • No Google Peering, hence backing up media is going to be a struggle on slow connections
  • Google’s cloud is without a doubt more securely managed and reliable than my vps.
  • Integration with Android is not as seamless as it was with Google apps, sync is almost always delayed (By 10 minutes. Yes, I’m an impatient (read ‘spoiled’) Google user)
  • Server maintenance is now my responsibility. Not a huge deal, but just something to keep in mind

Having said that, most of it is just a matter of getting familiar with the new set of tools in the arsenal. I’ve tried to keep most things minimal. Using few widely adopted technologies and keeping them regularly updated, sticking to the best practices and disabling any unwanted, potentially dangerous defaults and with that the server is secure from most adversaries. Let’s first define what “secure” means in the current context using a threat model.

Threat Model

The only thing worse than no security, is a false sense of security

Instead of securing everything in an ad hoc fashion, I’m using this explicitly defined threat model, which will help me prioritize what assets to secure and the degree of security, and more importantly, what threats I’m NOT secure against.

  • Compromised end device (Laptop): Since data is present unencrypted on my end, an adversary having access to my computer via say a ssh backdoor can easily get access to all of my (unencrypted) data. Private keys cannot be compromised as they are password protected. A keylogger might be able to sniff out my password which can then be used to decrypt any encrypted data.
  • Compromised end device (Mobile phone): Since data cannot be decrypted on the mobile, all encrypted data would remain secure. Only the unencrypted files will get compromised. However, if an adversary gets access to my unlocked cell phone, securing cloud data would be the least of my worries.
  • Man In The Middle (MITM): As long as Let’s Encrypt does it’s job, TLS used should be enough to secure the data against most adversaries eavesdropping on my network. It would not protect me if Let’s Encrypt (or any other CA) gets compromised and an adversary makes duplicate certificates against my domain and uses it to eavesdrop the traffic, the possibility of which is rare.
  • Server Compromise: If the server is compromised through any server side vulnerability (assume root access) and an attacker gets access to everything on the server, all unencrypted files are compromised, which would include contacts/calender lists. Since the decryption key is never transmitted to the server, encrypted files won’t be compromised.

Why Client Side Encryption

The entire exercise would look pretty pointless if I just took all my data from G Drive and pushed it to NextCloud. And from the previous cloud server attempt, I know how uncomfortable it is to have your data accessible from the network all the time. Those reasons were more than enough for me to go for an encrypted cloud solution. Although it would still look pointless if you were to ask me why didn’t I just encrypt the data and upload it to G Drive again. The answer is simply because I didn’t want to.

After some research (being a novice with security, that was a must), I came up with a list of guidelines that I had to write my solution on.

  • Use of symmetric key cryptography for file encryption, particularly AES-128
  • Memorizing the AES key or using public key cryptography to store the key of file en/decryption on disk. (Not sure which is the proper way of doing it, although I’ve asked the experts for help)

Encryption

There are a lot of tools one can use for data encryption. I used Gnu’s Privacy Guard (GnuPG or simply GPG). It is anything but easy to use. But the nice part is that it just works, is extensively reviewed by experts and has been around since I was 4 years old. So in theory,

  • Generate a public/private key pair in GPG
  • Generate a strong passphrase for the encryption, and encrypt it using the public key you just generated. Store it locally someplace secure
  • Get a list of all files and directories from a specific folder using find (for one time backups), or use rsync with a local sync copy (for incremental backups)
  • Iterate the list (of all or changed files). If item is a directory, create that directory, if item is a file, encrypt the file and push it to that directory.
  • After encryption, you’re left with either two or three directories, /original-dir, /remote-encrypted and optionally, /local-unencrypted-sync
  • The additional (local sync) directory is useful when incremental backups are required and rsync uses this directory to keep track of changes, and only (re)encrypts those files that have been added/changed since last sync. Useful to setup a cron job. At this point, you can delete the files in your /original-dir safely
  • Decryption is just the opposite of this. You supply the location of your /remote-encrypted directory and the script generates a new directory with unencrypted content.


Original directory


Encrypted backup directory

This does the job for now. Here’s the script that I’m currently using. I wanted to enable sync without the need for a helper directory, just like Git does (it stores the changes in the same directory in a .git/ directory). Will update it if I manage to get that done.

In Closing

Eighteen months ago, I wrote on how to create a ‘cloud’ storage solution with the Raspberry Pi and half a terabyte hard disk that I had with me. Although it worked well (now that I think about it, it wasn’t really a cloud. Just storage attached to a computer accessible over the network. Wait, isn’t that a cloud? Damn these terms.), I was reluctant to keep my primary backup disk connected to the network all the time, powered by the tiny Pi, and hence I didn’t use it as much I had expected. So what I did then was what any sane person would’ve anyway done in the first place, connect the disk with a usb cable to the computer for file transfers and backups.

Earlier this year, I switched ISPs and got this new thing called Google Peering, which enabled me to efficiently backup all my data to the real ‘cloud’ (Google Drive). That worked, and it was effortless and maintenance free. And although Google doesn’t have a native Linux client yet, the web client was good enough for most things.

And that was the hardest thing to let go. Sync and automatic backups were, for me, the most useful feature of having Google around. And while everything else was easy to replace, the convenience of Drive is something that I’m still looking for in other open source solutions, something I even mentioned in my previous post on privacy.

So although I now have this good enough cloud solution, it definitely isn’t for everyone. The logical solution for most people (and me) would be to encrypt the data and back it up to Google Drive, Dropbox or others. I haven’t tried, but Mega.nz gives 50GB of free tier end to end encrypted storage. Ultimately, it makes much more sense to use a third party provider than doing it all yourself, but then again, where’s the fun in that! Thank you for reading.

Privacy – How I Converted

In spite of my inclination towards cyber security from an early age (relative to when I ‘discovered’ the Internet), I never was a big fan of privacy over the web. I knew some bits here and there about it, like how my data is used to serve me targeted content, how tracking happens even after I close the browser tabs and how companies watch me visiting sites and track my habits. Heck, I found it fascinating that I saw adverts from third party companies about the products that I was currently researching about. Internet, to me, was like a close friend who knows everything about you, your habits and interests, your lifestyle and more. And when I say friend, it isn’t metaphorical. I literally trusted the web for every bit of work-life thing that I got involved into. I liked that my email was always synced, that Google asked if I wanted to review the place I was at, that all my photos were automatically backed up to the cloud, that I got a ‘This day 3 years ago’ notifications every once in a while, that I received personalized notifications about the bills that were unpaid and the events that were due, like magic!

And all these years, I’ve heard about numerous leaks, activists exposing unethical government secrets and mass surveillance and I was always disconnected from it. When Airtel & BSNL were injecting adverts into my web pages, I was okay with it. When Google or Whatsapp changed their privacy policies, I readily accepted the new ones, after all I’m sure they value their users, and decide in their best interests, right? After all, what do I have to hide?

Now, I consider myself a huge fan of free and open source software, and in the open source world, you readily trust the software or content, not because you personally trust the people behind it, but because the code is subject to scrutiny by fellow community members and as a result, the chances of using an open source software that is a malware or a back-doored Trojan is essentially zero (such attempts are readily caught. (meta meta: is this a survivor bias?)). I remember the heavy criticism of Ubuntu for logging the search keywords of it’s users for serving them targeted ads which eventually led to elite members of the open source community advising against using Ubuntu and RMS calling it a ‘spyware’. But what Ubuntu did is only tiny bit as harmful (they did put an option to opt out of this ‘shopping lens’, or uninstall it altogether) as some of the tools and services we use everyday. And that is what I realized in the past month.

From here, it is about how I turned 180 degrees and started to care about privacy and anonymity more than ever, how I became paranoid about the data that I publish online and think twice before registering for an online service, or visiting untrusted websites without a VPN. If you feel this is of no interest to you, I urge you to close this tab after watching the following video. The message is very powerful and I’d like you to give yourself sufficient exposure to the problem before deciding if you want to care. You may continue reading if you would like to learn about my decision and what led me to it.

Let’s start with the most obvious question…

Why now?

The anticlimactic answer is, better late than never. This article isn’t the result of a single blog post that I read or any specific incident. It is a cumulative result of the critical exposure I’ve had in the past month or two, and a subconscious exposure of the past few years. I had this on my mind from some time, but laziness is what I’d call it. Who wants to give away the convenience of synced devices and automatic backups! I’m fortunate enough to have a paranoid friend around who doesn’t use many (any?) social networking sites and online services. All he has is probably a ProtonMail email address, and he’s just as active on the Internet as I am. I always considered his view of privacy a personal preference, a subjective view of the world, not an objective truth about the Internet and companies based on Internet. But recently, the more exposure I’m getting about the way Internet giants collect and use my information, government surveillance etc, the more I’m moving away from using their services. It isn’t about if someone is watching me while I use the Internet, which no one probably is, given my uninteresting Internet activities. It is the possibility that at any given moment someone/something could watch me, without my consent, store tonnes of meta data about me for use 15 years from now, and I might lose the basic right to privacy that I always took for granted, is what makes me uncomfortable.

However I don’t expect anything to change when I make a switch. In most cases, nothing would change for me, as an individual who accesses and relies on the Internet everyday. Free and open source alternatives exist and it is a matter of hours (if not days) to make a complete switch from proprietary to open source software. But now, I’m leaving a lot less footprints in random server logs and by using open source whenever possible, I can narrow down the number of malwares and spywares I carry around with me in my phone or laptop. And something I really need to emphasis on, a spyware is not necessarily installed by just a third party malicious user. OEMs ship spywares all the time (tampering preinstalled TLS certificates and performing MITM attack to show ads, now that’s dark). All this is without even mentioning the humongous quantities of crapware these OEMs ship their products with, widening the attack surface for a third party adversary. All of this can be mitigated if you control what’s installed on your devices and choose what services to use.

If you want to know more about what sort of threats to privacy exist around you, you might want to check out this amazing course by Nathan House titled ‘The Cyber Security Course – Hackers Exposed’. Don’t get intimidated by the title, it is for anyone who wishes to understand the threat landscape so that he can take the necessary steps to ensure adequate security and privacy according to his needs. Nathan does a great job at putting the key points in front for you to decide rather than feeding you his opinions. Highly recommend his course.

Is the threat real?

This question arises in the minds of people when they hear about issues like Privacy and Global Warming (I was surprised to find a good number people think Global Warming isn’t real). Is this real? Or is it one of those hyped-stories that would fade away and everything will get back to normal once media stops covering it. Let me start by confessing that it was in the last month that I read the terms of service of any company I used online for the first time, and boy I was surprised. I agree that reading ToS is boring, but it really is critical to ensure a peace of mind when you use a service. If you’re still not sold, check out this amazing site called tosdr.org or Terms-of-service-didn’t-read which summarizes the ToS of popular services and rates them from class A (good policy) to class E (bad policy) and the key reasons supporting the rating. The data is a bit outdated, but you do get a general sense of the corporation’s privacy structure. And to be honest, you don’t need any of this. All you need to do is keep your eyes and ears open and assess the data you’re about to give to the next application you download from the market. Take Whatapp’s ToS for example, the service which promises that the messages are end to end encrypted with Signal Protocol. Sure, they are. And there’s no doubt in my mind that Whatsapp is one of the most secure messengers we have with us today. But privacy and security are two very different topics to discuss, both equally important (a good read here). And when it comes to privacy, it is not our messages or content that companies usually target. It is our meta-data. Here’s what Snowden tweeted about it.

Are your readers having trouble understanding the term "metadata"? Replace it with "activity records." That's what they are. #clarity

— Edward Snowden (@Snowden) November 2, 2015

Now there are a lot of articles on this topic and hence I don’t plan to get into it. To quote a key point from one of the articles about what meta data really is,

  • They know you rang a phone sex service at 2:24 am and spoke for 18 minutes. But they don’t know what you talked about.
  • They know you called the suicide prevention hotline from the Golden Gate Bridge. But the topic of the call remains a secret.
  • They know you spoke with an HIV testing service, then your doctor, then your health insurance company in the same hour. But they don’t know what was discussed.

This example should not imply Whatsapp is the only or worst offender, it is just one of them that I’m familiar with (and use personally). I assume that you can read and decide for yourself.

Why did I care?

“But Government Surveillance is a US problem!”, “My nation doesn’t come under 5, 9 or 14 eyes. Sure, but don’t we all use the same Internet? I don’t need to emphasis about the importance of Internet in matters of freedom of speech, how it has nullified international borders to connect people of similar interests and how many revolutions it has started. I don’t need to mention that Internet is a second home to political (h)activists and dissidents, a place where they can express themselves to the masses. I certainly don’t need to mention what I personally feel about the Internet which you probably know by now. And I’m not even getting started on how, even if you don’t belong to any of these ‘eye’ nations, most of your traffic is still getting routed through them.

The way I think about it is, a full disclosure sometimes becomes necessary to bring about a change in the system that is broken and is resisting a fix. This was one of the highlights in Keren Elazari’s TED talk on ‘Hackers: Internet’s immune system’

Precautionary measures I took

I adopted some defensive measures to try out this new Internet lifestyle, applying the learnings from the past couple of months, since it wouldn’t make sense to not do it after this exposure. This is quite experimental, so try out what works for you, the way I’m doing. A word of caution though. This list would not cover you from threats and privacy breaches from third party adversaries like cyber criminals, who might choose targets more specifically (like sending a malware via email or infecting your local network). The best (and in many cases, only) defense against it is to keep your systems (laptops, mobile phones) up to date with the latest security patches. Did I make it sound important enough? KEEPING YOUR SYSTEM SOFTWARE UP TO DATE AND PATCHED IS THE BEST THING YOU CAN DO TO STAY SECURE. Sorry for screaming. Okay, now back to the measures I took.

  1. Flashed LineageOS on my Phone – Almost stock Android, plus more control over what I install (note: rooting, flashing, installing from unknown sources etc potentially opens a huge security hole in itself)
  2. No Google Play Services – The suite of Google apps such as Gmail, Youtube, Docs and Drive are optional, and I chose to not install them
  3. Gave up my G Suite subscription. So no synced devices and automatic photo backups. (Remember to ‘Takeout’ data before leaving)
  4. Turned off port forwarding, DMZ, UPnP and any other service on my router that might expose any of my internal devices to the Internet
  5. K-9 Mail as email client
  6. SkyTube for read-only tracking-free Youtube
  7. f-droid for free and open source Android apps, also there are plenty of closed source apk repositories that don’t require a Google account.
  8. DuckDuckGo as the default search engine across all devices
  9. Debian or Arch linux on desktop, as recommended by Nathan House, provides a good mix of active development, security, support and speed, although you can pretty much choose any good distro depending on your taste and harden it.
  10. Signal Messenger on Android/iOS for Whatsapp like security and usability minus the meta data issues
  11. Firefox Focus as the primary browser on phone, except when explicitly wanting to store history, in which case Mozilla Firefox
  12. Mozilla Firefox on desktop, and Chromium as the secondary browser. Google Chrome is a better browser imho, for it supports a lot more content types than Chromium does out of the box. Not to mention better updates and security. Boils down to your personal preference, really
  13. Deluge for torrents
  14. LibreOffice for document/presentation editing needs
  15. VLC for pretty much everything multimedia
  16. And the rest of the goodies you get with any nice distro. (Must admit that I haven’t found a Google Drive alternative yet)
  17. Lastly (and optionally), encrypted mail providers like ProtonMail for secure email and a good VPN (such as Mullvad, or Tor for that matter, but make sure you read the differences) for use when on public Wifi hotspots

Needless to say, that is what I’m using right now, and kind of recommend. Except for the couple of options on the top, I’m sure most of you are familiar with (and probably use) the rest. If yes, that awesome. That’s a win for the free and open source community. And I’m not affiliated to any of those! Haha

Is any of this necessary?

“You are overdoing it!” as my friend exclaimed. I totally agree, and to be really honest, it is not just about privacy at this point, it is about enjoying the new world that I’ve found, exploring the corners and trying to fit in. I believe that open source shouldn’t feel like a compromise. It should be a pleasant experience for everyone who uses it, whether or not they consciously care about it being free and open. I am sure not everyone is so willing to give away convenience for the sake of some principles and ‘freedom of the web’, and that is totally fine. As long as you take the decision of giving away your data and are okay with it for the rewards it comes with, and not let a corporation decide it for you, I’m no one to tell otherwise. I’m here to tell you that there’s a world out there that represents the open and free nature of the Internet, and it is not at all difficult to convert. I did, and so can anyone.

Links from the post

Aggregating all the blog / additional information post links with their titles from the above text here.

Thank you for reading.

A Programmer Or A Problem Solver?

Normally when we think of programming, we think of problem solving. Similarly, if someone works in the field of computers and say they love problem solving, we immediately assume that they work with some computer code. Programming is almost synonymous with problem solving, in that it involves breaking down complex looking problems into simple mini problems that can be easily taken care of. Where they might differ, in my opinion, is finding the right problems to solve.

The inspiration for this post came from a recent blog post that I read (embarrassed to admit I’m not able to find that post found it!) about the mistakes developers make. One of the mistakes was confusing between the love for problem solving and programming. That was a little “ahaa!” moment for me. It gave me a moment to reflect on my own likes and interests. What is it that excites me? Is it the idea of building the next big thing? Maybe. Is it spending countless hours writing code that does what has been done a million times before, just so that you can fall in love with your code all over again? Yes, that’s sounds about right.

I liked to call myself a problem solver, but I’m not even close to being one. I didn’t feel like there was a distinction. But there definitely is, now that I’ve met some people in my field who are ‘problem solvers’ first. I don’t have Github projects that reflect a problem solver. What you’d rather find are spot on examples of reinventing the wheel (a dozen chat/social network networks), attempts to write the most beautiful code that I can (regardless of whether it works or not), over-engineering to say the least, projects made entirely for trying out new languages, new frameworks, new IDEs, literally. I’m somewhat embarrassed to admit that the current project I’m working on is a chat backend as well.

You get the pattern. It isn’t hard to understand that I love programming. I love writing code regardless of the problem in hand. I see people building things that are changing the world, the way we live, the way we communicate, the way we travel and I appreciate them all. We need people like that. They are on the frontiers of the information age that programmers like me and many of you are riding on. It is here we start to see the difference between someone who is a programmer first and someone who is a problem solver first. I believe it is a matter of preference, experience, and the level and kind of exposure one has in the budding years in tech.

I had a couple of “tech friends” right from my junior college some 4 years ago, people with whom I could discuss programming and tech in general. Most of my collaborated projects were with these people, and we always worked on something because we enjoyed it, purely out of our “passion” for computers. We spoke about new languages, technological advancements and people in tech in our free time. We never thought any of this would help us get a good job or any project would appear on our CVs. That is one reason why technologies and frameworks are scattered all over my blog, instead of quality projects that people actually use in one particular technology that I could’ve been good at.

So that’s what I’ve learnt recently, and wished to share here. I (or anyone) won’t know which way is the right way to go, or if there is one, and it doesn’t really matter much as long as you enjoy what you’re doing and make a decent sum of money doing it. I admire people who are passionate about programming as well as problem solving, and the world would really be incomplete without either one. Thank you for reading.