
Ebooks and privacy – here are things ebook stores may learn about you

Image by Nikita Buida / Freepik

Which book you have just bought is only the beginning of a long list of personal data ebook stores may collect and process.

Reading books is seen as one of the most personal and private activities in life. You escape from the real world and dive into another one, which anyone else could learn about only if you shared it with them.

With the growing number of concerns about online privacy, you assume that your sensitive data may be at risk when you use a Facebook app or visit a website which displays Google Ads.

Do you assume you sail into a safe haven the moment you start reading an ebook? You shouldn’t. Ebook platforms can collect a vast amount of data about what you read, how you read, and what you think. They may process it to adjust their offer to your needs – and who knows what else.

Obviously, everything depends on the ebook store you are connected to and the device or app you use. We may, however, assume that ebook platforms can collect a similar amount of information as other services or apps.

In other words, the Kindle app for iPad can have access to the same user data provided by iOS (the operating system for iPad and iPhone) as Facebook or Twitter. In this regard, there is no difference between a book-reading app and a social app.

Ebook stores need some information about you for the sake of convenience. Thanks to the collected data, you can have the content of your ebook library synced between devices or find your next interesting book much more easily.

But do they need all the information they collect? Do they process it? Is this information processed anonymously? Is there a human operator involved at any stage?

If you want to have 100% privacy while reading books, go to the nearest bookstore and buy a printed book. If you want to read books conveniently, stick to ebooks, but accept a lower level of privacy and be aware of the traces you leave.

What private data ebook platforms may collect

Books you buy

In the era of online shopping, it’s something usual and commonly accepted. Any purchase you make is recorded and associated with your account.

Some say that the fact someone else may know what books you buy marks the end of your privacy. When you compare that to how you used to buy print books (an anonymous handsome person enters a cozy bookshop), it may sound right. But in the era of web shopping, a vast majority of users don’t have a problem with that any longer.

If you buy a print book on Barnes & Noble’s web store, the retailer knows about it as well – and you accept it because otherwise you would not be able to complete the purchase.

How could this info be used?

  • To learn which book genre and author you like the most – just look at your purchase history. It’s all in there: all the books you added to your account since you registered a few years ago. There are dozens, hundreds of them. Your ebook platform sees the same list – but has powerful tools to process this data in many complex ways. These guys may know more about your reading preferences than you do!
  • To learn which language is your mother tongue – if you are a user of an ebook platform that sells books in multiple languages.
  • To learn whether you rely on reviews and ratings – you don’t see the whole picture because you think about reviews only at the moment of purchase. It may turn out that 80% of the ebooks you bought were rated at least 4.5/5 and came with at least 100 reviews. That tells the ebook store to recommend you only books with a similar or better rating.
  • To learn whether you are sensitive to price – the price of the book at the moment of purchase can be (and probably is) recorded in the system. Barnes & Noble may learn that you purchased 75% of your Nook books for $2.99 at the time they were featured on Nook Daily Find.
  • To learn which price level you accept – if you bought several books at a regular price, let’s say between $12.99 and $19.99, why should the ebook store recommend you books that cost $4.99?
  • To control subscription-based downloads – Kindle Unlimited allows the subscriber to download up to 10 books to connected devices. If you have two devices (or apps), with 5 books on each of them, you won’t be able to download a 6th title to either device until you remove one of the existing Kindle Unlimited books.
  • To display more effective ads and book recommendations – in the end, it all boils down to making you buy more stuff. Based on your purchase history, the ebook platform will recommend you books you are more likely to buy, instead of showing millions you will never be interested in. Also, you will see promotions or get emails that are targeted to your user profile.
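To show how little effort such profiling takes, here is a minimal sketch in Python. It is not how any real store works; the `Purchase` record, its field names, and the sample books are all made up for illustration. It simply computes two of the signals from the list above: the share of highly rated purchases and the share of books bought below their regular price.

```python
from dataclasses import dataclass


@dataclass
class Purchase:
    # Hypothetical record; a real platform's schema is unknown.
    title: str
    price_paid: float    # price at the moment of purchase
    list_price: float    # regular price of the book
    rating: float        # average rating at the moment of purchase
    review_count: int


def share(purchases, predicate):
    """Return the fraction of purchases that satisfy the predicate."""
    if not purchases:
        return 0.0
    return sum(1 for p in purchases if predicate(p)) / len(purchases)


def profile(purchases):
    """Summarize reliance on ratings and sensitivity to price."""
    return {
        # Thresholds taken from the example in the text: 4.5/5 and 100 reviews.
        "highly_rated_share": share(
            purchases, lambda p: p.rating >= 4.5 and p.review_count >= 100
        ),
        "discounted_share": share(
            purchases, lambda p: p.price_paid < p.list_price
        ),
    }


# Invented sample data.
history = [
    Purchase("Book A", 2.99, 12.99, 4.6, 250),
    Purchase("Book B", 2.99, 14.99, 4.7, 1200),
    Purchase("Book C", 2.99, 9.99, 4.2, 80),
    Purchase("Book D", 12.99, 12.99, 4.8, 500),
]
result = profile(history)
print(result)
```

With a purchase history of a few hundred books instead of four, the same two numbers would already say a lot about how you choose what to buy.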

Books you read

Analyzing the user’s purchase history is more than enough to effectively increase the chance of your next purchase. Ebook platforms don’t need anything else to generate more sales, do they?

Wait. How many titles from your vast ebook collection have you actually read? 100%? Probably not. 50%? Yes, including the books you abandoned. 20%? For sure, probably yes, quite possible.

The fact you buy a book does not mean you read it. The difference between a print and digital book is that the latter enables the seller to learn what happens after the purchase.

Let’s say you buy ten books in the Kindle Store because they were featured in Kindle Daily Deal and are from your favorite category. Their titles, prices, ratings, and other data are recorded. Which book do you pick up first? This is recorded as well.

Why have you picked this particular book and not one of the nine remaining? What’s in it that made you choose it? Does it have to do with the author, topic, or anything else?

Any time you open a particular ebook on your e-reader or book app, you create a data row in your activity sheet somewhere on a server of your ebook provider. Why would they recommend you books related to the ten you bought if they could instead show you only books related to the one you actually read?

How could this info be used?

  • The same way as with the books you buy – but this time with a much higher chance of turning the ebook platform’s efforts into your purchase.

Own books and documents

The same thing happens with the books you obtain outside your ebook platform and then add to your account.

Almost half of the books I keep in my Kindle library in the cloud were bought in Polish-language ebook stores. After the purchase, I add the book manually to my Kindle library by sending the file as an attachment to an email address associated with my Amazon account. I also have multiple classic novels I downloaded for free from Project Gutenberg.

The moment you open a Project Gutenberg book in your Kindle, Kobo, or Google Play Books app and sync it is the moment this book starts to be processed by the ebook platform.

How could this info be used?

  • To learn the source of books – whether the book comes from one of the legal sources of public domain books, another ebook platform, or an unknown source.
  • To learn your mother tongue – and recommend books in your language or related to your country of origin.
  • To learn whether you are price sensitive – if most of your sideloaded books come from free sources, you are probably not ready to buy ebooks at their regular prices.

Devices or apps you use

Have you ever wondered why Amazon knows you have bought the “2nd Kindle” or “4th Fire”? You don’t have to do anything to let Amazon learn that. When you sign in to a Kindle app on a new Android phone or to a new Kindle, this information (much more detailed than just the name of the device, model, and operating system) is sent automatically.

“A device is a device, why worry,” you may think. Just like with ebooks, the history of connected devices can tell a story about you. Let’s compare three users.

  • Martha – Kindle Keyboard, Kindle, Kindle Paperwhite 2, Kindle Paperwhite 4
  • Adam – Kindle Touch, Fire HDX 8.9
  • Emily – Kindle, Fire 7, iPad mini, iPad Pro 12.9

Martha is an avid reader who will continue buying Kindle e-readers. She will be willing to benefit from deals on Kindle devices and ebooks.

Adam stopped reading Kindle books a long time ago.

Emily probably won’t buy a Kindle again. She is getting wealthier, is less sensitive to price, and is ready to enjoy media-rich books. The chances are high she may switch to another ebook platform.

How could this info be used?

  • To adjust book recommendations – if you are a Kindle user, you are most likely to buy novels, so why not show them on a special offers screensaver? If you are reading on an iPhone, you may be willing to try an audiobook companion to the Kindle book. If you are using a 10-inch Amazon Fire, you may consider trying interactive cookbooks. If you have connected an Echo speaker, at some point you will want to try an Audible membership.
  • To prevent you from going to another ebook platform – switching from a dedicated e-reader to a tablet with a connected app is the first step toward losing the user. The ebook platform may decide to offer you special incentives to keep you.
  • To offer relevant subscription-based services – a Kindle user may want to buy a pre-paid Kindle Unlimited plan, but if you use an Amazon Fire tablet, you may want to take advantage of Prime Reading instead.

Devices and apps you use at the same time

A history of connected devices can say a lot about you. The ebook platform may learn even more by analyzing how the currently connected devices are being used.

Tracking daily habits is something we will deal with in the next section. Right now, let’s focus on how multiple device usage affects the way the ebook store thinks about us.

This information is especially important in the era of account sharing. A single user account with three or four connected devices and apps… Hmm, is this still one person behind it? Don’t assume it would be difficult for the ebook store to figure it out. Just the opposite. It shouldn’t be hard to develop an algorithm that compares the usage and draws automated conclusions.

If two devices are being actively used at the same time, it most likely means there are at least two people connected to the single account. If you combine that with the books being read on these devices, you can guess who these people are – whether they are of a similar age (wife and husband) or of different ages (parent and child).
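Such an algorithm really doesn’t have to be complicated. The sketch below is my own illustration in Python, not anything an ebook store has published: it assumes the platform keeps a list of reading sessions as (device, start, end) entries, and it reports which pairs of devices were in use at the same moment. The device names and times are invented.

```python
from datetime import datetime
from itertools import combinations


def overlapping_devices(sessions):
    """Return pairs of device names whose sessions overlap in time.

    `sessions` is a list of (device, start, end) tuples (assumed format).
    """
    pairs = set()
    for (dev_a, start_a, end_a), (dev_b, start_b, end_b) in combinations(sessions, 2):
        if dev_a == dev_b:
            continue
        # Two intervals overlap when each one starts before the other ends.
        if start_a < end_b and start_b < end_a:
            pairs.add(tuple(sorted((dev_a, dev_b))))
    return pairs


# Invented sample data: one evening on a shared account.
sessions = [
    ("Kindle Paperwhite", datetime(2019, 3, 3, 20, 0), datetime(2019, 3, 3, 21, 0)),
    ("Fire 7", datetime(2019, 3, 3, 20, 30), datetime(2019, 3, 3, 20, 50)),
    ("iPhone app", datetime(2019, 3, 3, 22, 0), datetime(2019, 3, 3, 22, 30)),
]
shared = overlapping_devices(sessions)
print(shared)
```

Here the Kindle Paperwhite and the Fire 7 were both active between 8:30 and 8:50 pm, which hints at two readers. The iPhone session later that evening overlaps with nothing, so it tells us nothing on its own.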

How could this info be used?

  • To figure out which device is used by the actual payer – why display an offer to the family member who is not eligible to make purchases?
  • To learn which device is used by the heaviest user – the ebook platform’s efforts should be addressed to this person as well, even if she or he is not the one who clicks or taps the “Buy” button.
  • To offer subscription-based services – there is no better opportunity to offer an ebook subscription: every member of the group can get unlimited access to the books she or he wants to read. If the platform learns your account is used mostly by kids, you may see an invitation to subscribe to kid-friendly membership plans.

Your reading habits

I realize that you may be aware of most of the things you read here, but my goal is to put everything together to show the scale of possible data collection.

Again, it all depends on the ebook platform and operating systems powering apps and devices. However, we can assume that, from a technical point of view, all the following bits of data can be collected, and there is no reason not to process them.

  • What time you open the app and what time you close it
  • How many times a day you open the app
  • How long you read in one session
  • How much time passes between page turns
  • What highlights you make
  • What notes you write
  • What passages of text you share
  • How often you look up words and what these words are
  • How often you use the built-in dictionary and which dictionary it is
  • What are your preferred settings, how often you access them
  • Which theme you prefer, whether you switch themes, and whether it depends on the time of the day
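To make the list above more tangible, here is a rough Python sketch of what a single record of a reading session could look like. Every field name in `ReadingSession` is my assumption; I have no knowledge of how any ebook platform actually stores this data. The two methods show how easily session length and time per page fall out of such a record.

```python
from dataclasses import dataclass, field
from datetime import datetime
from statistics import median


@dataclass
class ReadingSession:
    # Hypothetical record; all field names are made up for illustration.
    book_id: str
    device: str
    opened_at: datetime
    closed_at: datetime
    page_turns: list = field(default_factory=list)  # timestamps of page turns
    lookups: list = field(default_factory=list)     # words looked up in the dictionary
    theme: str = "light"

    def duration_minutes(self):
        """Length of the session in minutes."""
        return (self.closed_at - self.opened_at).total_seconds() / 60

    def median_seconds_per_page(self):
        """Median gap between consecutive page turns, or None with fewer than two turns."""
        gaps = [
            (later - earlier).total_seconds()
            for earlier, later in zip(self.page_turns, self.page_turns[1:])
        ]
        return median(gaps) if gaps else None


# Invented sample data: a half-hour evening session on a phone.
session = ReadingSession(
    book_id="example-book",
    device="iPhone app",
    opened_at=datetime(2019, 3, 3, 22, 0),
    closed_at=datetime(2019, 3, 3, 22, 30),
    page_turns=[
        datetime(2019, 3, 3, 22, 1, 0),
        datetime(2019, 3, 3, 22, 2, 30),
        datetime(2019, 3, 3, 22, 3, 30),
        datetime(2019, 3, 3, 22, 5, 30),
    ],
    lookups=["haven"],
    theme="dark",
)
print(session.duration_minutes(), session.median_seconds_per_page())
```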

I read mostly on my iPhone using the Kindle app. An evening session, before I go to bed, lasts usually half an hour. Then, I fall asleep with the iPhone in my hand. During a day, I usually manage to read in small chunks, between 5 and 20 minutes, depending on my timetable.

You have learned that because I made a deliberate decision to share this private information with you.

The thing is that Amazon can collect and process this information automatically. I don’t feel comfortable with this, and I would like to have a chance to control which data I share with the provider of the service.

How could this info be used?

Small bits of information on how you use a book-reading app or an e-reader could be used in various ways. Obviously, there is no proof they are being actually used this way. What you will see in the following list, however, can show the scope of possibilities.

  • To figure out the best time to display offers – what is the best time to send an offer? The moment the user is actually using the service. If the ebook platform learns that you read for two hours every Sunday evening, it would be in its interest to send you the offer on that day and at that time.
  • To learn whether you have a job – if you read twice a day, one hour in the morning and one hour in the late afternoon, it may suggest you are a daily commuter and you read on public transport on your way to and from work.
  • To learn what time you fall asleep – no matter how insane this may sound, it is actually easy to figure out. When you stop using an app, you usually close it or lock the device. If you don’t do that, the app and/or device automatically goes into sleep mode. When you read in the evening and the device goes into sleep mode on its own, it may suggest you have just fallen asleep.
  • To learn whether you have vision problems – if the font in your book-reading app is set to a size that’s much larger than usual, it may suggest you need reading glasses but still don’t use them.
  • To count subscription-based books as read – this is actually something Amazon admits to collecting and processing. Any Kindle Unlimited or Prime Reading book is counted as read if the user reads about 15% of the content.
  • To learn why users abandon reading a particular book – if you stop reading a book on a certain page, don’t go back to it, and delete it from the device, it’s a signal there may be something wrong with the text on this particular page. Imagine that many users abandon reading the same book at exactly the same paragraph. If you were an author, would you like to know it, so that you could modify the text in order to reduce the abandonment rate?
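The falling-asleep case from the list above can be sketched in a few lines of Python. Again, this is only my illustration under stated assumptions: the labels "auto_sleep" and "closed_by_user" are hypothetical, the 9 pm cut-off is arbitrary, and the sample data is invented. The sketch also ignores bedtimes after midnight to stay short.

```python
from datetime import datetime, time
from statistics import mean


def likely_bedtimes(sessions, evening_start=time(21, 0)):
    """Return end times of evening sessions that ended with an automatic sleep.

    `sessions` is a list of (ended_at, end_reason) tuples, where end_reason
    is "closed_by_user" or "auto_sleep" (hypothetical labels).
    """
    return [
        ended_at
        for ended_at, reason in sessions
        if reason == "auto_sleep" and ended_at.time() >= evening_start
    ]


def average_bedtime(sessions):
    """Average the inferred bedtimes; return None when there is no evidence.

    Simplification: times after midnight are not handled.
    """
    times = likely_bedtimes(sessions)
    if not times:
        return None
    minutes = mean(t.hour * 60 + t.minute for t in times)
    return time(int(minutes // 60), int(minutes % 60))


# Invented sample data.
sessions = [
    (datetime(2019, 3, 4, 22, 40), "auto_sleep"),
    (datetime(2019, 3, 5, 8, 15), "auto_sleep"),       # morning, ignored
    (datetime(2019, 3, 5, 23, 0), "auto_sleep"),
    (datetime(2019, 3, 6, 22, 30), "closed_by_user"),  # closed on purpose, ignored
]
bedtime = average_bedtime(sessions)
print(bedtime)
```

Two evening sessions that ended without the reader closing the app are enough to produce a guess. Whether that guess is right is another matter, but it costs the platform nothing to make it.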

Your notes and highlights

The fact you buy a book about Donald Trump doesn’t mean you are a Trump voter. But the passages of text you highlight and the notes you make may well reveal it.

Many ebook platforms offer a very convenient feature that lets you sync highlights and notes, so that you can access them even if you have archived the book or don’t have access to the e-reading device or app. It means that your personal opinions are being stored in the cloud, just like the emails you send from Gmail. It obviously doesn’t mean that someone else is reading your highlights, but they are not as private as the highlights you make in your print book, that’s for sure.

Notes and highlights are your very private matter, unless you decide to share them. There is a huge difference between what you write for social media or in an email and what you highlight in a book. In the first case, you share the text that is intended to create a desired reaction. You assume other people will read it; you want them to read it.

The highlight you make for yourself reflects your real opinion far better than the text you share with others.

Would you like to know all highlights Bill Gates makes in books he reads? Not only the ones he shares on Twitter, but also the ones he highlights to memorize?

Would an ebook platform ever be tempted to process your highlights? This is already happening. For some time now, Amazon has offered a feature called Popular Highlights, which enables any Kindle user to see which passages of books were highlighted by other users – and by how many of them.

How could this info be used?

  • To sync your highlights and notes – if you want to access your highlights from any device, not only the e-reader you use, you have to accept the fact that the highlights are being sent through the cloud server operated by the ebook platform, just like your Gmail messages go through Google’s servers.
  • To learn about your true opinions – obviously, we are landing in a dystopian world now, but as long as there is a technical possibility to access your highlights, you never know who will read them. The truth is that you can build a profile of a person based only on the highlights she or he makes. To me, book highlights are like a secret diary composed of other people’s text.

Conclusions

Living in the digital era is about finding balance between activity and privacy. Between what you share and what you keep for yourself. Between the risk of your voice not being heard when you speak and the risk of being talked about when you stay quiet.

Social media services are well ahead in informing users about privacy conditions. They put a lot of effort into guiding their users through conditions of use, and offer several options to choose the level of security and privacy.

In this area, ebook platforms are far behind. You can reach the highest level of privacy only when you log out of the system, when you deregister the app or device. There are no intermediate steps between full convenience and full privacy. There is almost no way to control what data to share with the platform.

It has to change because internet users are getting more and more concerned about their online privacy and need better tools to control it.
