Merge pull request #3955 from davidfischer/docs-advertising-details

davidfischer · web-flow · commit 8432800a6c5b · 2018-04-18T06:07:15.000-07:00
Add advertising details docs
diff --git a/docs/advertising-details.rst b/docs/advertising-details.rst
@@ -0,0 +1,130 @@
+Advertising Details
+===================
+
+.. NOTE: This document is linked from:
+.. https://media.readthedocs.org/javascript/readthedocs-analytics.js
+
+Read the Docs largely funds our operations and development through advertising.
+However, we aren't willing to compromise our values, document authors,
+or site visitors simply to make a bit more money.
+That's why we created our
+:doc:`ethical advertising <ethical-advertising>` initiative.
+
+We get a lot of inquiries about our approach to advertising which range
+from questions about our practices to requests to partner.
+The goal of this document is to shed light on the advertising industry,
+exactly what we do for advertising, and how what we do is different.
+If you have questions or comments,
+`send us an email <mailto:rev@readthedocs.org>`_
+or `open an issue on GitHub <https://github.com/rtfd/readthedocs.org/issues>`_.
+
+
+Other ad networks' targeting
+----------------------------
+
+Some ad networks build a database of user data in order to predict the types
+of ads that are likely to be clicked.
+In the advertising industry, this is called *behavioral targeting*.
+This can include data such as:
+
+* sites a user has visited
+* a user's search history
+* ads, pages, or stories a user has clicked on in the past
+* demographic information such as age, gender, or income level
+
+Typically, getting a user's page visit history is accomplished by the use of trackers
+(sometimes called beacons or pixels).
+For example, if a site uses a tracker from an ad network and a user visits that site,
+the site can now target future advertising to that user -- a known past visitor --
+with that network. This is called *retargeting*.
+
+Other ad predictions are made by grouping similar users
+together based on user data using machine learning.
+Frequently this involves an advertiser uploading personal data on users
+(often past customers of the advertiser)
+to an ad network and telling the network to target similar users.
+The idea is that two users with similar demographic information
+and similar interests would like the same products.
+In ad tech, this is known as *lookalike audiences* or *similar audiences*.
+
+Understandably, many people have concerns about these targeting techniques.
+The modern advertising industry has built enormous value by centralizing
+massive amounts of data on as many people as possible.
+
+
+Our targeting details
+---------------------
+
+**Read the Docs doesn't use the above techniques**.
+Instead, we target based solely upon:
+
+* Details of the page where the advertisement is shown including:
+
+  * The name, keywords, or programming language associated with the project being viewed
+  * Content of the page (eg. H1, title, theme, etc.)
+  * Whether the page is being viewed from a mobile device
+
+* General geography
+
+  * We allow advertisers to target ads to a list of countries or to exclude
+    countries from their advertising.
+  * We geolocate a user's IP address to a country when a request is made.
+
+Read the Docs uses GeoLite2 data created by `MaxMind <http://maxmind.com>`_.
+
+.. note::
+
+   We are considering expanding geographic targeting in the USA and Canada.
+   Because the USA and Canada are so large, we are considering allowing ads to be
+   targeted to a state or province or to a major metro area (DMA).
+   This document will be updated if that happens.
+
+
+.. _advertising-analytics:
+
+Analytics
+---------
+
+Analytics are a sensitive enough issue that they require their own section.
+In the spirit of full transparency, Read the Docs currently uses Google Analytics (GA).
+
+GA is a contentious issue inside Read the Docs and in our community.
+Some users are very sensitive and privacy conscious to usage of GA.
+Some authors want their own analytics on their docs to see the usage their docs get.
+The developers at Read the Docs understand that different users have different priorities
+and we try to respect the different viewpoints as much as possible while also accomplishing
+our own goals.
+
+Advertisers ask us questions that are easily answered with an analytics solution like
+"how many users do you have in Switzerland browsing Python docs?". We need to be able
+to easily get this data. We also use data from GA for some development decisions such
+as what browsers to support (or not) or how much usage a particular page or feature gets.
+
+We have taken steps to address some of the privacy concerns.
+Read the Docs instructs Google to anonymize IPs sent to them before they are stored.
+
+Alternatives
+~~~~~~~~~~~~
+
+We are always exploring our options with respect to analytics.
+There are alternatives but none of them are without downsides.
+Some alternatives are:
+
+* Run a different cloud analytics solution from a provider other than Google
+  (eg. Parse.ly, Matomo Cloud, Adobe Analytics).
+  We priced a couple of these out based on our load and they are very expensive.
+  They also just substitute one problem of data sharing with another.
+* Send data to GA (or another cloud analytics provider) on the server side and
+  strip or anonymize personal data such as IPs before sending them.
+  This would be a complex solution and involve additional infrastructure,
+  but it would have many advantages. It would result in a loss of data on
+  "sessions" and new vs. returning visitors which are of limited value to us.
+* Run a local JavaScript based analytics solution (eg. Matomo community).
+  This involves additional infrastructure that needs to be always up.
+  Frequently there are very large databases associated with this.
+  Many of these solutions aren't built to handle Read the Docs' load.
+* Run a local analytics solution based on web server log parsing.
+  This has the same infrastructure problems as above while also
+  not capturing all the data we want (without additional engineering) like the
+  programming language of the docs being shown or
+  whether the docs are built with Sphinx or something else.
diff --git a/docs/ethical-advertising.rst b/docs/ethical-advertising.rst
@@ -33,9 +33,6 @@ or seeing paid ads if you want.
 You will still see :ref:`community ads <community-ads>`,
 which we run for free that promote community projects. 
 
-We have gone into more detail about our views in our `blog post <https://blog.readthedocs.com/ads-on-read-the-docs/>`_ about this topic.
-Eric Holscher, one of our co-founders `talks a bit more <http://ericholscher.com/blog/2016/aug/31/funding-oss-marketing-money/>`_ about funding open source this way on his blog.
-
 .. _ethical-info:
 
 Our worldview
@@ -94,6 +91,21 @@ in a way that makes us feel good.
 
 .. _fake ad clicks: https://en.wikipedia.org/wiki/Click_fraud
 
+Additional details
+------------------
+
+* We have additional documentation on the
+  :doc:`technical details of our advertising <advertising-details>`
+  including our use of analytics.
+* We have an `advertising FAQ`_ written for advertisers.
+* We have gone into more detail about our views in our
+  `blog post <https://blog.readthedocs.com/ads-on-read-the-docs/>`_ about this topic.
+* Eric Holscher, one of our co-founders
+  `talks a bit more <http://ericholscher.com/blog/2016/aug/31/funding-oss-marketing-money/>`_
+  about funding open source this way on his blog.
+
+.. _advertising FAQ: https://readthedocs.org/sustainability/advertising/faq/
+
 Join us
 -------
 
diff --git a/docs/index.rst b/docs/index.rst
@@ -59,6 +59,7 @@ Information about development is also available:
    gsoc
    code-of-conduct
    ethical-advertising
+   advertising-details
    sponsors
    open-source-philosophy
    story
diff --git a/media/javascript/readthedocs-analytics.js b/media/javascript/readthedocs-analytics.js
@@ -1,21 +1,5 @@
-// Google Analytics is a contentious issue inside Read the Docs and in our community.
-// Some users are very sensitive and privacy conscious to usage of GA.
-// Other users want their own GA tracker on their docs to see the usage their docs get.
-// The developers at Read the Docs understand that different users have different priorities
-// and we try to respect the different viewpoints as much as possible while also accomplishing
-// our own goals.
-
-// Read the Docs largely funds our operations and development through advertising and
-// advertisers ask us questions that are easily answered with an analytics solution like
-// "how many users do you have in Switzerland browsing Python docs?". We need to be able
-// to easily get this data. We also use data from GA for some development decisions such
-// as what browsers to support (or not) or how much usage a particular page/feature gets.
-
-// We have taken steps with GA to address some of the privacy issues.
-// Read the Docs instructs Google to anonymize IPs sent to them before they are stored (see below).
-
-// We are always exploring our options with respect to analytics and if you would like
-// to discuss further, feel free to open an issue on github.
+// For more details on analytics at Read the Docs, please see:
+// https://docs.readthedocs.io/en/latest/advertising-details.html#analytics
 
 
 // RTD Analytics Code