(Originally posted on May 28, 2021)
Random Observation/Comment #716: We’re always talking to the future. Anything we write can be an archive for a future generation.
This is the age of data collection, so it makes sense to be a little bit more aware of the things you input into the interwebs… a series of tubes. I’ve done this exercise to map the data I create in order to think more about privacy and whether or not the monetary aspects are worth protecting.
The short answer is Google. I would have reached 30 way too quickly, so this is just a general mapping. I’ve tried to add thoughts on what could be useful with my data.
(Thinking through what could be trained on my data has been extremely valuable.)
Google
Platform usage (if collected) – On Android, from Chrome browser cookies, from general phone usage and types of apps
For good: Digital well-being health and app recommendations
Searches and subsequent clicks on:
Emails – Seems like more and more of this is just delete/archive after reading
Web pages – Maybe a few searches a week off of random thoughts to lead down a Wikipedia hole
Maps physical locations – I do a lot of Google Maps searching because I like the idea of looking at different locations
Files on gdrive – Not much stored here of importance
Scholar papers – Good subset of material specific to research papers
YouTube Videos – Mostly my searches for British comedies or magic. Not sure I want everything I’ve watched fed into a training set
Searches via voice
Asking Google what this random song is – You’re probably always listening
Email scraping – They’ve learned a lot for autocorrect and autofill! Who’s to say they’re not also looking at the context? This is less interesting because most of my email responses are pretty short.
Reviews and images uploaded to Maps – I liked being a Google Local Guide – I’d love to keep getting perks for contributing my opinions about restaurants.
Location info as I move around with my phone – Probably the most data about my pinpoint location and activities (even within a house if they cared enough)
Data storage and backup via gdrive – Everything is backed up somewhere
Web-based spreadsheet and slides making – Lots of slides and not sure any of the slide-making is interesting data
News articles read via Google News – I balance my news intake decently well, but I do like to see what’s being pushed through
Videos watched on YouTube – Especially with Shorts, there are a lot more data points because you build your own preference profile based on how these videos are tagged, sorted, and recommended between different bubbles.
Payment using gpay – Not much gpay usage outside of storage purchases, which are usually some software subscriptions like Canva
Note taking – Although I wonder if they actually look at content – Not much they’d want to see unless they’re looking for some lists of 30. I love Google Keep.
Facebook / Social Networks
Personal images and videos uploaded – Trying to reduce these with people’s faces because there will be so much deep fake tech out there.
Personal comments likely no one cares about – I can’t imagine social networks caring about specifically the comments made on regular photos (unless these are stories that get more attention)
Attention kept on certain pages or stories – This is probably a huge portion of the app data collected. I’m sure they can tell how long I spent scrolling past certain ads or posts in order to recommend similar ones.
Likes per type of account or specific posts – Not too worried about this because I just like friends’ cute photos and travel posts
Amazon
Anything I search for explicitly – Based on searches and purchases, I’m sure they suggest different ads and products that could be accessories. I don’t think this is a bad thing.
Anything I buy once or recurring – Purchases themselves in a recurring manner could be of interest as they create more solid networks of choices between items.
Whenever I talk around my Alexa – I have no doubt the Echo is listening in on random things.
Microsoft
LinkedIn data specific to job searching or social network posts – Job searching is probably an important data point some people may want to keep hidden.
Android Keyboard strokes for predicting words – Microsoft SwiftKey – I am most paranoid about predictive text usage on keyboards. They know too much! Remember to always wipe your clipboard.
AT&T & Local Internet (depending if I’m using WiFi, 5G or text messages)
Web traffic – Whether through apps or directly through HTTP requests in browsers, I’m sure some of the traffic going over your WiFi or network is being monitored
SMS text messages – I’m assuming none of the SMS messages I send are private and that any of them can be subpoenaed by a court of law
Possible location data – Likely being sold or shared as data for monitoring traffic or some other set of services
Streaming services (Netflix, Hulu, HBO Max, Disney+, etc.)
What we’ve watched –> Recommending shows – Not the worst thing out there. I don’t see the shows I watch mentally overlapping with things I buy or even things I like to talk about.
How long we’ve watched and during which time of the day – The only thing they’ll know is we watch more kids’ shows in the morning and fewer at night.
Podcast and audio (Stitcher, Audible, Spotify)
What I listen to –> Recommending shows – I haven’t found anything too devious about the audio medium. It would be nice to get more honed podcast ads instead of just ones about mattresses and food delivery services.
Reddit
Worthless internet points and upvotes – I would say Reddit is no longer an open news source, with all the bots and the flooding of uptrends for certain subreddits. I’ll just stick to random Life Pro Tips.
~See Lemons Get Data Mined