simutrans.com automatic profiling

Started by prissi, March 06, 2014, 09:34:03 PM

prissi

At http://www.domsreport.com/search.php?site=simutrans.com there is a site which gathers various information about websites. For a site that is only around number 277,000 in a world ranking, the description seems very accurate:

Quote
Simutrans.com is ranked #277,341 in the world according to the three-month Alexa traffic rankings. The site has been online since 2003. Compared with the overall internet population, its audience tends to be male; it also appeals more to low-income, childless users browsing from school and home. We estimate that 13% of Simutrans.com's visitors are in Taiwan, where it has attained a traffic rank of 9,706. Visitors to the site spend approximately 39 seconds on each pageview and a total of five minutes on the site during each visit.

Spacethingy

Quote
it also appeals more to low-income, childless users

How the heck do they tell that?!
Life is like a Simutrans transformer:

You only get one of them, and you can't have it on a slope.

An_dz

Cookies my friend, cookies and all the power of tracking.

They know the videos you watch, the sites you visit, the products you buy and much more. This data gives an almost perfect profile of every person in the world.

IgorEliezer

Quote from: An_dz on March 06, 2014, 10:39:20 PM
Cookies my friend, cookies and all the power of tracking.
For those who don't believe it and are Firefox users: there's a featured Firefox add-on called Lightbeam. Install it, use the Internet for a day as usual, check the Lightbeam report and let your jaw make a hole in the floor.

jamespetts

I like cookies, especially chocolate chip.

Sarlock

It doesn't take much data to create a fairly accurate profile, just based on the sites you visit.

Just think of the data your government has on you......
Current projects: Pak128 Trees, blender graphics

Ters

Quote from: An_dz on March 06, 2014, 10:39:20 PM
Cookies my friend, cookies and all the power of tracking.

They know the videos you watch, the sites you visit, the products you buy and much more. This data gives an almost perfect profile of every person in the world.

What I find more puzzling is how they get stats for simutrans.com. There are no ads or other third-party content as far as I can see, so they can't use cookies for that, unless you, An_dz, are part of the conspiracy. Cookies are passive; they don't do anything but hold onto some value. (Cookies are limited to a domain, so they can only track you as long as the page contains some content from the tracking site. That content doesn't have to be visible, though.)
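
To make the domain-scoping point concrete, here is a minimal sketch of the rule a browser applies when deciding which cookies to attach to a request. It is a simplified version of the domain-match rule from RFC 6265, not a full implementation (it ignores paths, expiry and IP addresses), and the names `tracker.example`, `tracker_id` and `forum_session` are made up for illustration.

```python
from typing import Dict, List


def domain_matches(request_host: str, cookie_domain: str) -> bool:
    """Simplified domain match: the host equals the cookie's domain
    or is a subdomain of it."""
    host = request_host.lower().rstrip(".")
    domain = cookie_domain.lower().lstrip(".")
    return host == domain or host.endswith("." + domain)


def cookies_sent(request_host: str, jar: Dict[str, str]) -> List[str]:
    """Return the names of the cookies that would be attached to a request
    for request_host. `jar` maps cookie name -> cookie domain."""
    return sorted(
        name for name, domain in jar.items()
        if domain_matches(request_host, domain)
    )


# Hypothetical cookie jar: one cookie set by the forum, one by a tracker.
jar = {"forum_session": "simutrans.com", "tracker_id": "tracker.example"}

# A request to the forum carries only the forum's own cookie.
print(cookies_sent("forum.simutrans.com", jar))

# The tracker's cookie is sent only when the page pulls in something
# (a script, an image, a button) hosted on the tracker's domain.
print(cookies_sent("cdn.tracker.example", jar))
```

So a page without any embedded third-party content gives a tracker's cookie no request to ride along on.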

But it is curious how they found out which surfing habits correspond to various ages, genders and marital status. I guess Facebook made that easier.

IgorEliezer

Quote from: Ters on March 07, 2014, 06:08:08 AM
But it is curious how they found out which surfing habits correspond to various ages, genders and marital status. I guess Facebook made that easier.
When you click on a link posted on this forum (hosted at simutrans.com), the request carries not only the destination page to open but also the origin, the page where the link you clicked was posted (the HTTP Referer header). This forum is full of links to YouTube, ImageShack, Imgur, Facebook and who knows what else. Does that help? I still prefer the idea of An_dz being part of the conspiracy; it's easier.
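
The origin information described above travels in the HTTP `Referer` request header. As a rough sketch of what the destination site can do with it, the snippet below extracts the referring host from a set of request headers. The header values and the forum URL are hypothetical, and real servers treat header names case-insensitively, which this simplified version does not.

```python
from typing import Dict, Optional
from urllib.parse import urlparse


def referring_site(headers: Dict[str, str]) -> Optional[str]:
    """Return the host of the page the visitor came from,
    or None if the browser sent no Referer header."""
    referer = headers.get("Referer")
    if not referer:
        return None
    return urlparse(referer).hostname


# Hypothetical request arriving at a video site after a click on the forum.
example_headers = {
    "Host": "video.example",
    "Referer": "https://simutrans.com/forum/some-topic",  # made-up URL
}

print(referring_site(example_headers))
```

The destination learns which site sent the visitor, but nothing about age or gender unless it already holds a profile for that browser.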

ӔO

Google can take a guess at your gender and age with not only your cookies, but also your search history if you are logged in to their services.
My Sketchup open project sources
various projects rolled up: http://dl.dropbox.com/u/17111233/Roll_up.rar

An_dz

Quote from: Ters on March 07, 2014, 06:08:08 AM
What I find more puzzling is how they get stats for simutrans.com. There are no ads or other third-party content as far as I can see, so they can't use cookies for that, unless you, An_dz, are part of the conspiracy.
No, no, I'm not part of the conspiracy. There are no cookies or trackers on simutrans.com. We do collect data, but only the very basic data provided by your HTTP header (https://en.wikipedia.org/wiki/HTTP_header), and this data is collected by us and only we have access to it. So basically we can't track who's accessing the site.

Quote from: Ters on March 07, 2014, 06:08:08 AM
Cookies are passive, they don't do anything but hold onto some value. (Cookies are limited to a domain, so they can only track you as long as the page contains some content from the tracking site. That content doesn't have to be visible, though.)
You can collect data with JavaScript and store it in a cookie: you can create a script that adds data to a cookie depending on where the user clicks, you can even create one to check whether the person has scrolled the page, etc. And it's not hard to read the data from the cookie and send it to whoever you want.

Quote from: Ters on March 07, 2014, 06:08:08 AM
But it is curious how they found out which surfing habits correspond to various ages, genders and marital status. I guess Facebook made that easier.
Facebook, Google+ and all the social media help the tracking a lot. You know that little Like or +1 button on the page? It's tracking you, even if you don't have an account for that service; having an account just helps them collect better data.

And it's not hard to work out a person's profile: they check the times of day you access sites and what catches your attention, and from that they can create a very consistent profile. For example, someone with a baby, or wanting one, will browse through kids-related stuff, but if the person opens a lot of kids' cartoons, this person has a kid. It also depends on which articles you have read and which held your attention the most, and the list goes on.

And tracking nowadays is not limited to the web. Google Chrome is the best tracking mechanism I have ever seen. They know what URL you typed, they know what URLs you opened even if there are no trackers on that page, they know when you opened, closed or minimized the browser, and they know what you clicked and used in the browser. Oh, and of course you can disable some of these, but not everything. And since Chrome is now the most used browser, they basically know everything in the world.

Ters

Quote from: IgorEliezer on March 07, 2014, 11:21:35 AM
When you click on a link posted on this forum (hosted at simutrans.com), the request carries not only the destination page to open but also the origin, the page where the link you clicked was posted (the HTTP Referer header). This forum is full of links to YouTube, ImageShack, Imgur, Facebook and who knows what else. Does that help? I still prefer the idea of An_dz being part of the conspiracy; it's easier.
That still doesn't reveal my age in itself.

Quote from: An_dz on March 07, 2014, 03:34:53 PM
You can collect data with JavaScript and store it in a cookie: you can create a script that adds data to a cookie depending on where the user clicks, you can even create one to check whether the person has scrolled the page, etc. And it's not hard to read the data from the cookie and send it to whoever you want.
As I wrote, cookies don't do anything. You need something else to do something to them. That was my point.

Quote from: An_dz on March 07, 2014, 03:34:53 PM
And it's not hard to work out a person's profile: they check the times of day you access sites and what catches your attention, and from that they can create a very consistent profile. For example, someone with a baby, or wanting one, will browse through kids-related stuff, but if the person opens a lot of kids' cartoons, this person has a kid. It also depends on which articles you have read and which held your attention the most, and the list goes on.
My browsing habits may confuse them big time, which might really be the case judging from some of the ads they throw at me. Either that, or there are a lot of strange people out there. I've read maternity forums more often than I like to admit (thanks to Google), despite being neither female nor a parent (a somewhat typical simutrans.com visitor, in other words), because for some reason they contain answers to all kinds of things completely unrelated to children. One of the last things I found the answer to on such a site was a tax question.

prissi

So, to conclude, they get most of their answers from the Like and Google+ buttons on the pages?

Ters

Quote from: prissi on March 07, 2014, 09:06:00 PM
So, to conclude, they get most of their answers from the Like and Google+ buttons on the pages?

I wouldn't be surprised if Google Ads and Google Analytics give Google lots of data as well. Google Analytics is, after all, made to gather data, and although it is meant for the owner of the website, Google likely uses the data themselves once they have it. Google Analytics is normally invisible (unless it malfunctions).

Such buttons on simutrans.com seem to be stored locally and therefore won't give away any information by themselves.

prissi

The buttons, yes; but clicks on them are processed via Facebook etc., where all that information is mostly already there, supplied by the user. (And through your credit card, Google will also have very detailed personal information if you have an Android phone.)

IgorEliezer

Quote from: Ters on March 07, 2014, 05:56:30 PM
That still doesn't reveal my age in itself.
First I'd replace "my" with "our". It's not so easy to collect all data from everyone all the time. There will always be 5-25% "unknown", where it wasn't possible to collect gender, age, country, browser, etc. from a user.

Even if simutrans.com has no trackers, other domains may have them and can "guess" the audience of simutrans.com. Say, for example, you have profile cookies from Facebook: when you click on a Facebook link placed on a simutrans.com page, they will know you came from simutrans.com. Cross-referencing the data is just a step away.
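
As a rough illustration of that cross-referencing step, the sketch below shows how a site that already holds visitor profiles (keyed by its own cookie) could tally an audience breakdown per referring site. The log entries, the `gender` field and the URLs are all invented for the example; this is not how any particular company is known to do it.

```python
from collections import Counter, defaultdict
from urllib.parse import urlparse

# Hypothetical tracker-side log: (profile looked up via the tracker's own
# cookie, Referer header of the incoming request).
hits = [
    ({"gender": "male"}, "https://simutrans.com/forum/topic-a"),
    ({"gender": "male"}, "https://simutrans.com/forum/topic-b"),
    ({"gender": "female"}, "https://simutrans.com/forum/topic-a"),
    ({}, "https://simutrans.com/forum/topic-c"),  # no profile available
    ({"gender": "female"}, "https://cooking.example/recipes"),
]


def audience_by_site(log):
    """Count profile attributes per referring host."""
    stats = defaultdict(Counter)
    for profile, referer in log:
        site = urlparse(referer).hostname
        # Visitors without a known profile end up in the "unknown" bucket.
        stats[site][profile.get("gender", "unknown")] += 1
    return stats


stats = audience_by_site(hits)
print(dict(stats["simutrans.com"]))
```

Nothing here requires a tracker on simutrans.com itself; the outgoing clicks are enough to build the tally.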

Ters

Well, since I don't click on such buttons, they won't track me then. Then again, I don't visit www.simutrans.com at all. Tapatalk, or whatever it is called, might track me on this forum, though.

prissi

The browser may prefetch pages (and thus links) even if you do not click them at all. The same goes for the official Like button (not ours), which at one time downloaded over 100 kB of code from Facebook.

isidoro

Well, I think there is an antidote for all this poison: lying.  Once you corrupt the database, the database is of no use...   8)   Noise or entropy is the cure, since you can't prevent them from gathering data.

Click those google +1 buttons at random, visit products you don't like in online tracking stores, disclose false data about yourself, use proxies or the Tor network, give some bandwidth for free to your neighbours...

Nevertheless, the information they gave is not very accurate about me (50%), not much better than pure chance.  My grandmother or a magic 8-ball would do better indeed.

An_dz

Quote from: isidoro on March 09, 2014, 01:16:24 AM
disclose false data about yourself, use proxies
That's what I do. :evil laugh:


sdog

Quote
Well, I think there is an antidote for all this poison: lying.  Once you corrupt the database, the database is of no use...   8)   Noise or entropy is the cure, since you can't prevent them from gathering data.

That doesn't work, for several reasons:

The false information would be random. The interesting information is not, so it can still be found in large enough statistical samples. The single person is not of very much interest anyway.

Unless you automate it, or completely replace your interests with faux interests, it is not doable, as you'd have to spend more time faking your internet behaviour than actually using the net.

And if they don't get it from you, they get it from others that are like you. Living in the same type of neighbourhood, having a similar job, age, education, etc., the profile is most likely rather accurate.

Even if they couldn't do that, they'd get you through your social network (not as in Facebook, but as in social): through stored email addresses and searches for you typed into Facebook and LinkedIn. By knowing who knows you, it is possible to know you very well too.
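
The first objection above (random noise washes out in a large sample) can be illustrated with a toy simulation. This is only a sketch under simple assumptions: one genuine interest, fake clicks spread uniformly over a handful of made-up topics, and a fixed random seed so the run is repeatable.

```python
import random
from collections import Counter

# Made-up topic list; "trains" plays the role of the genuine interest.
TOPICS = ["trains", "cooking", "fashion", "sports", "gardening"]


def simulate_clicks(n_clicks, noise_fraction, true_topic="trains", seed=0):
    """Simulate a click log in which a fraction of the clicks is fake
    and uniformly random, and the rest follow the genuine interest."""
    rng = random.Random(seed)
    counts = Counter()
    for _ in range(n_clicks):
        if rng.random() < noise_fraction:
            counts[rng.choice(TOPICS)] += 1  # fake, random click
        else:
            counts[true_topic] += 1  # genuine click
    return counts


# Even when half of all clicks are fake, the fakes are spread thinly
# over every topic while the genuine clicks pile up on one.
counts = simulate_clicks(10_000, noise_fraction=0.5)
print(counts.most_common())
```

With these assumptions the genuine topic still comes out far ahead of the others, which is why the noise would have to be targeted rather than random to mislead the profile.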

Ters

Quote from: sdog on March 09, 2014, 07:50:37 AM
And if they don't get it from you, they get it from others that are like you. Living in the same type of neighbourhood, having a similar job, age, education, etc., the profile is most likely rather accurate.

That I can live with. It is just good old prejudice.

isidoro

@sdog: that reminds me of polls about which party one will vote for in the next election.  They also ask which party you voted for in the last election.  They do that to measure how many people lie (they have statistics about the last election's results, of course).  The right thing is to lie in the former question and tell the truth in the latter...   ;)

And about your post, it's all a matter of the liars being the majority... So, join us.  Or maybe an army of faking bots, tricking the data collector bots...  Then it's the cannonball vs. wall story again...