Customer Data is the Customer’s (private) Data — It should not be open for all — Part 1 — Tatasky

Discovery by: Rahil Bhansali & Ankit Pandey

Disclaimer: What you’ll read below and in this series of blog posts are simple but massive security vulnerabilities that exist in some of the most popular brands. My approach in each case has always been (and will continue to be) to understand the extent of the loophole, exhaust every contact to try and bring it to the company’s notice, have them fix it, and then write about it so that consumers and companies alike can focus on improving their defences in protecting consumer data, privacy and security. In cases where I’ve not been able to reach the company, the post serves as a last-ditch effort to alert the company so that they can fix the issue. Please note that we have no particular affinity to any brand or company. We are subscribers or customers of some of the brands mentioned in the series, and we admire their leadership and what they do for our country.

Part 1: Tatasky

Summary of Finding: Tatasky, on https://www.tatasky.com/ (thanks to basic coding mistakes), inadvertently exposed its customers’ data, including:

  • Name
  • Gender
  • DOB
  • Email
  • Registered Mobile Number
  • Alternate Phone Number
  • Address
  • Subscriber ID
  • Subscription Balance
  • Subscription Start Date
  • Subscription End Date
  • Transaction History since first subscription
  • Service Requests since first subscriber complaint
  • # of Boxes — active & inactive

The data accessible included names, phone numbers & addresses of celebrities, popular business people, doctors, among others.

Scale: The above data for 22 million+ subscribers was accessible to anyone who knows how to hit an API (any developer, to be honest). The data was accessible not only for active subscribers but also for subscribers who have been inactive for a few years (so, in my opinion, the number affected is far higher than 22 million).

Possible Malicious Uses of Data: Data acquired from such vulnerabilities could theoretically be used in many different malicious ways, some posing risks to consumer privacy and security, and others posing a business risk to the company itself. Here are some ways:

  • Sale / Use of Sensitive Information: Equipped with a large database of Personally Identifiable Information (PII), an attacker could use the data to pretend to be the user across various corporate call centers in India (the company had a lot of users across major metros at one point in time).
  • Consumer Targeting by Competitors: Any DTH or telecom company wanting to target subscribers for its offering could potentially target subscribers who’ve recently deactivated their membership or who spend a lot with the company. They could also identify the top-paying customers and target them (legally, or otherwise by writing scrapers).
  • Phishing and Credit Card Details: Given the sensitivity of the information, realistic-looking KYC update or subscription renewal phishing emails/SMS could easily be devised and sent to the customer. A subscriber could be taken to a look-alike quick recharge page and asked to enter their credit card details, thereby helping the attacker get even more information on the subscriber and cause a financial loss. Examples of possible phishing campaigns that I could theoretically have run, had I been thinking of malicious use:
1. Hi {{ name }}, your Tatasky balance for subscriberId: {{ subscriberId }} is under Rs. 100. Please click <a href="www.google.com">here</a> to renew to avoid any service interruptions
2. Hi {{ name }}, switch to XYZFiber and get a 4K setup box for FREE. Please click <a href="www.google.com">here</a> to setup a new connection
3. Hi {{ name }}, your KYC needs to be updated. Is this still a valid contact number: {{ mobile number }} and address: {{ address}}? Please click <a href="www.google.com">here</a> to update.

Backstory: On December 28/29th, I decided to recharge our Tatasky account (which had been lying inactive for a few months). Based on my conversation with customer care, I went to the website and entered my phone number to do a quick recharge. To my surprise, it showed me my name, subscriber ID, balance and subscription end date without any form of login. I finished my recharge and was about to sleep when something about my interaction with the website began to bother me.

I opened my laptop and tried the same flow again for a friend’s number and a family member’s number. To my surprise, I could see the same details for their accounts as well (which btw — is still easily accessible by anyone who uses the website — in my opinion it needs to be behind a login mechanism and I’ve already communicated the same to the company).

Discovery (Technical Explanation): With my tech capabilities, I was able to look under the hood and realised that the APIs being called (the layer that fetches the data from a database) were open and not behind any sort of authentication. Based on this discovery, I tried to see if other APIs were open (lacking authentication) as well.

An initial probe showed that I had to enter an OTP to access pages like transaction / recharge history and my profile. The API which gave me basic subscriber details had masked the phone number and email address. I felt comforted that not a lot was exposed (but that lasted only a few minutes).
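To make the masking mentioned above concrete, here is a minimal Python sketch. The function names and the number of characters left visible are assumptions made purely for illustration; they are not taken from Tatasky's actual implementation.

```python
def mask_mobile(mobile: str, visible: int = 2) -> str:
    """Hide all but the last `visible` characters of a phone number."""
    if len(mobile) <= visible:
        return "*" * len(mobile)
    return "*" * (len(mobile) - visible) + mobile[-visible:]


def mask_email(email: str) -> str:
    """Keep the first character of the local part and the full domain."""
    local, sep, domain = email.partition("@")
    if not sep:
        # Not a well-formed address: hide everything.
        return "*" * len(email)
    return local[:1] + "*" * max(len(local) - 1, 0) + "@" + domain


print(mask_mobile("9820012345"))        # ********45
print(mask_email("rahil@example.com"))  # r****@example.com
```

Masking only helps if every API that returns the same record applies it, which is exactly where things went wrong here.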

When I saw the API structure for the my profile, transaction history, service request and similar APIs, I realised that the header accepted an access token (which is a way of authenticating which user is accessing the data). However, the developer who wrote these APIs decided to ignore the token and chose to pass the mobile number as a parameter to the API instead. The result (presumably unintended, since it’s an amateur developer mistake) was that I could now pass any mobile number without logging in as that user and access the user’s unmasked address, email, recharge history, # of boxes and service requests.
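The pattern described above can be sketched in a few lines of Python (3.9+). Everything here is hypothetical: the in-memory `SESSIONS` and `SUBSCRIBERS` tables, the handler names and the sample records are stand-ins for illustration, not Tatasky's code. The first handler trusts a caller-supplied mobile number; the second resolves the subscriber from the access token only.

```python
# Hypothetical stand-ins for a session store and a subscriber database.
SESSIONS = {"token-alice": "9820000001"}  # access token -> verified mobile number
SUBSCRIBERS = {
    "9820000001": {"name": "Alice", "address": "1 Example Road"},
    "9820000002": {"name": "Bob", "address": "2 Example Road"},
}


class Unauthorized(Exception):
    """Raised when the request carries no valid access token."""


def get_profile_vulnerable(headers: dict, params: dict) -> dict:
    # Anti-pattern: the token in the header is never checked, and the
    # record is selected by whatever mobile number the caller sends.
    return SUBSCRIBERS[params["mobile"]]


def get_profile_fixed(headers: dict, params: dict) -> dict:
    # Resolve the token server-side to the subscriber it was issued to.
    token = headers.get("Authorization", "").removeprefix("Bearer ")
    mobile = SESSIONS.get(token)
    if mobile is None:
        raise Unauthorized("missing or invalid access token")
    # Any "mobile" value in params is deliberately ignored.
    return SUBSCRIBERS[mobile]
```

With the fixed handler, a logged-in Alice who sends Bob's number still gets only her own record, and a caller without a token gets nothing.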

Replicating at Scale (Technical Explanation): In under 2 hours, I was able to write a simple script and build a webpage that, on inputting any phone number, would return all the open data in a table format. This was important so that I could bring the extent of the vulnerability to the company’s notice without much back and forth (this, after all, was a pro bono discovery). This also worked for random subscriber IDs that had been posted in LinkedIn and Twitter complaints to the company’s customer care team.

Now the question was — can you get this data at scale for the entire subscriber base without having to input one mobile number at a time? After all, a sophisticated attacker would have to do that to really make use of the information.

To solve the problem of which mobile numbers to pass, we used a Wikipedia article to understand mobile number formats in India. We also used the oldest series, like 9820, 9821, etc., from Vodafone in the Mumbai circle so that we could quickly test our script out. The assumption was that if we took any 4-digit series and sequentially went through the last 6 digits (000000 to 999999), we’d have a million hits on the open APIs and be able to pull data for whichever numbers returned a match.

But to do this at scale so that the process would not take too long, my dear friend and ex-colleague, Ankit Pandey, came in and helped me write code to run the script on multiple threads and write the results to a CSV. With Ankit’s help, in another hour or two, we had a script that we could theoretically run for all the mobile numbers in India to see whether each user was a subscriber. On New Year’s Eve, December 31st, 2020, we tried this with a few 4-digit series just to check that our script was working and our assumption was correct.

To our surprise (and rather sadly, it actually worked), we were able to pull data for random mobile numbers. Additionally, since there was no API rate limiting, you could keep hitting the APIs any number of times over a few hours and they would keep returning data. If there was a 403 error, just changing the IP solved the issue (again, disheartening that it was so simple). With that understanding, it was time we reached out to Tatasky so that this could be fixed.
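For reference, the kind of control that was missing can be sketched as a small sliding-window limiter in Python. This is a minimal in-memory illustration under assumed names and limits (`SlidingWindowLimiter`, 20 calls per minute); it is not what Tatasky deployed, and a production setup would typically keep the counters in a shared store.

```python
import time
from collections import defaultdict, deque


class SlidingWindowLimiter:
    """Allow at most `limit` calls per `window` seconds for each key."""

    def __init__(self, limit: int, window: float, clock=time.monotonic):
        self.limit = limit
        self.window = window
        self.clock = clock  # injectable so the limiter can be tested
        self._hits = defaultdict(deque)

    def allow(self, key: str) -> bool:
        now = self.clock()
        hits = self._hits[key]
        # Drop timestamps that have fallen out of the window.
        while hits and now - hits[0] >= self.window:
            hits.popleft()
        if len(hits) >= self.limit:
            return False
        hits.append(now)
        return True


# Hypothetical usage: the key decides what is being limited.
limiter = SlidingWindowLimiter(limit=20, window=60)
if not limiter.allow("203.0.113.7"):
    print("429 Too Many Requests")
```

The choice of key matters. As the 403 workaround above shows, a limit keyed only on client IP can be sidestepped by rotating IPs, so a limiter like this complements authentication rather than replacing it.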

Note: We have not pulled the entire database because we have no use for it; we only tested whether it was technically possible. My intention is not, and never will be, to trespass, only to understand what someone with malicious intent could possibly do. I do not understand the legality of it, and my intent has been and always will be to have the issue patched by the company!

Company Interaction & Fix: Between December 30/31st, 2020 and January 4th, 2021, I tried various ways to reach the company. My first obvious choice was to find out the names of the company’s leadership and connect with them on LinkedIn. I tried connecting with the CEO, the CIO and another senior executive.

To my delight, Harit Nagpal (the CEO) was very quick in accepting the request. However, I couldn’t get his work email, which I needed to feel confident about sending such sensitive findings. So I reached out to a mutual connection (I’m keeping his name anonymous since I haven’t asked him, but I am extremely thankful, since his introduction helped fix the issue in a timely manner) who introduced Harit and me over email.

We sent Harit and his team the findings by email (with Ankit in cc). A few hours later, authentication was working on the APIs and most of the issues were resolved. Basic rate limiting was also added (however, changing the IP would circumvent it). Thankfully, the Tatasky team was prompt with the fixes and ensured that all the APIs requiring authorisation were behind the authorisation check.

However, one issue still remains: the subscriber’s name is still accessible for any mobile number. The communication I’ve received is that it’s there to make the user experience better. My take: I’d sincerely request the leadership to reconsider, because it gives sophisticated attackers the ability to build a database of verified mobile numbers to use in phishing scams, like 1 & 2 in the phishing examples above (more and more common in India today). I’ve spent time checking other providers as well, like Jio, Vodafone and Airtel, and they’ve all refrained from implementing such user experiences, presumably because of security risks similar to those detailed in this article. If you are a reader or subscriber, what’s your take on this flow?

Recommendations for Interaction with Security Researchers / Developers: As developers ourselves, we understand that issues exist in any tech product — but we do hope issues when discovered are handled through dialogue. A few suggestions for companies:

  1. Do engage with people who are trying their best to ethically bring a vulnerability to your notice. It may also help you control the narrative around the finding when it’s in the public domain.
  2. Do publish on your website a non-customer-care email that can be used to report such issues, or train customer care to escalate these. In my next post, you’ll see that customer care simply failed to understand and escalate a massive issue.
  3. It doesn’t hurt to have a bug bounty program or to enlist on websites like hackerone.com so that these issues can be discovered quickly.
  4. Do an audit and check the logs to see when this issue started (my guess is that it’s been there since the initial codebase) and how many unauthorised people have accessed the data in the last few years. I would seriously suggest that companies invest in anomaly detection.

Recommendations for Developers & General Basic API Security Tips (please research further and ensure best-in-class security): Below are some recommendations for ensuring security for your digital products:

  1. Open APIs are dangerous. Ensure APIs are always behind a bearer token when accessing any sort of consumer data. Middleware for authentication can go a long way in ensuring that every API requiring authorisation is always checked. If there is an issue, it can be fixed right there. Having a user session service that interacts only with such middleware to return data can also help.
  2. Never pass user-sensitive data as parameters to fetch information unless it’s a login / registration / profile update form: Always trust the session service or the token to get details of the authorised user. Sending parameters like a mobile number always leaves you open to attacks.
  3. Invest in API Testing & Monitoring: Ensure APIs are tested individually and in different sequences to prevent attackers from finding vulnerabilities. Writing API tests and periodically running the automated tests can help in the long run.
  4. Rate Limit: Ensure sensitive (actually, all) APIs cannot be hit beyond a certain number of times from a public interface. Ensure the same mechanism can also block suspicious IPs. If you have a business in India catering to Indian consumers, see that you have geography-specific rate limiting as well.
  5. Encryption: Encrypting or hashing sensitive parameters goes a long way. During my research, Vodafone was doing this very well.
  6. Client-Side vs. Server-Side Applications: Understand the basic architectural differences between the two kinds of application. Dish TV has also implemented a flow similar to Tatasky’s. However, because they are using a .NET server-side rendered application that, from a cursory look, is not API-driven, it was not easy to replicate the same findings. You can still see the name against any mobile number on their website as well, but it’s a smaller security risk since bulk scripts cannot be easily written and executed.
  7. Implement Captchas on Open Forms: Captchas on open forms are important for slowing down scripted abuse, such as bulk lookups or automated attempts at attacks like SQL injection.
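As one possible reading of tip 5 above, here is a minimal Python sketch of replacing a raw mobile number in API parameters with a keyed hash (HMAC-SHA256 from the standard library). The names `SECRET_KEY`, `opaque_ref` and `lookup`, and the sample record, are hypothetical. This is an assumption about how such a scheme could look, not a description of what Vodafone or any other provider actually does.

```python
import hashlib
import hmac

# Hypothetical key; in practice it would be loaded from a secret store.
SECRET_KEY = b"replace-with-a-real-secret"


def opaque_ref(mobile: str) -> str:
    """Return a keyed hash of the mobile number for use as an API parameter."""
    return hmac.new(SECRET_KEY, mobile.encode(), hashlib.sha256).hexdigest()


# Hypothetical server-side index: opaque reference -> subscriber record.
SUBSCRIBERS = {"9820000001": {"name": "Alice"}}
BY_REF = {opaque_ref(m): rec for m, rec in SUBSCRIBERS.items()}


def lookup(ref: str):
    # A raw or guessed mobile number is not a valid reference and simply misses.
    return BY_REF.get(ref)
```

Without the server's key, walking through 000000 to 999999 no longer yields valid parameters. This only helps if the reference is handed out after the user has authenticated, so it works alongside tips 1 and 2 rather than instead of them.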