KDD 2019
September 6, 2019
I had the opportunity to go to KDD in Anchorage this past August. Since my team has incrementally moved our systems away from Python as our main language, I asked my manager whether it’d make sense for me to go to PyCon again (see my notes from 2016 and 2018), or if there’d be a more relevant conference that I should look out for this season. She suggested a few options, and KDD was one of them. I didn’t know much about it, but its focus on analytics meant that there would surely be pieces that were relevant to my day-to-day work, and since it was Apple’s first year sponsoring it, there was extra motivation to attend.
The fact that it was held in Alaska was another selling point - almost every day after the conference, I did some hiking around Anchorage, and I went to Denali the weekend after. Check out the full photo album on my other blog post.
I met a bunch of interesting people there, and I particularly enjoyed the closing ceremony dinner, as I ended up talking with a pretty diverse group of people. Our conversations touched on cultural differences and the ethical problems of computer science. We talked for hours, and the topics spanned everything from the implications of artificial intelligence on what we consider to be a person, to objectivity and its sources, and more. That evening was great.
Most of the talks and events that I went to at the conference were related to system evaluation, experimentation frameworks, and classic information retrieval/search problems. A lot of it went over my head, but that was the point of being there! Below are some of my notes on the most interesting papers I learned about and events I attended.
A/B Experimentation/System Evaluation
Fundamentals of large-scale sequential experimentation - Aaditya Ramdas from Carnegie Mellon on statistical robustness across experiments over time. They’re reviving the study space of confidence sequences and always-valid p-values. Dealing with problems of peeking early and of the relationship between experiments. Additionally, his team is studying a system to deal with sequential control of expected false discoveries via alpha budgeting. This was one of my favorite talks of the whole conference, even if somewhat idealistic.
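The peeking problem Ramdas described is easy to see in a toy simulation (mine, not from the talk): if you repeatedly apply an ordinary fixed-threshold z-test at interim looks during an A/A experiment, the chance of a false positive climbs well past the nominal 5%. This inflation is exactly what always-valid p-values and confidence sequences are designed to prevent.

```python
import math
import random

def peeking_false_positive_rate(n_experiments=2000, n_steps=1000,
                                looks=10, z_crit=1.96, seed=0):
    """Simulate A/A experiments (no true effect) and report how often a
    naive z-test, checked at several interim looks, declares significance."""
    rng = random.Random(seed)
    look_points = [n_steps * (i + 1) // looks for i in range(looks)]
    false_positives = 0
    for _ in range(n_experiments):
        total = 0.0
        rejected = False
        for step in range(1, n_steps + 1):
            total += rng.gauss(0, 1)  # unit-variance observations, true mean 0
            if step in look_points:
                z = total / math.sqrt(step)
                if abs(z) > z_crit:   # naive fixed-threshold test at this look
                    rejected = True
                    break
        false_positives += rejected
    return false_positives / n_experiments
```

With 10 looks at a 1.96 threshold, the simulated false positive rate lands around 4x the nominal 5% level, which is why "just check the dashboard daily and stop when it's significant" is a broken procedure.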
The Anatomy of a Large-Scale Online Experimentation Platform - Microsoft now has a 3rd party experimentation platform for Azure customers. This paper discusses a lot of their architecture decisions. I had lunch one day with Somit, the lead author, and he walked me through some of the challenges they have been dealing with and how they’ve constrained their use cases to let the system itself correct for common biases in how the data is analyzed.
Challenges, Best Practices and Pitfalls in Evaluating Results of Online Controlled Experiments - Authors from Microsoft, Snap, FB, and Outreach discussing some of the problems they face in their experimentation systems.
Winner’s Curse - Airbnb paper on additive experiments.
“We are interested in learning the aggregated impact of the launched features. In this paper, we investigate a statistical selection bias in this process and propose a correction method of getting an unbiased estimator.”
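The selection bias they describe falls out of a few lines of simulation (my toy version, not the paper's method): when you launch only the features whose measured lift clears a bar, you are preferentially selecting features whose noise happened to be positive, so the sum of launch-time estimates overstates the sum of true effects.

```python
import random

def winners_curse_gap(n_features=5000, noise_sd=1.0,
                      launch_threshold=1.0, seed=1):
    """Each feature has a small true effect; we observe a noisy A/B
    estimate and launch features whose estimate clears the threshold.
    Returns (summed estimates, summed true effects) of launched features."""
    rng = random.Random(seed)
    est_total = 0.0
    true_total = 0.0
    for _ in range(n_features):
        true_effect = rng.gauss(0.1, 0.2)                 # true per-feature impact
        estimate = true_effect + rng.gauss(0, noise_sd)   # noisy A/B readout
        if estimate > launch_threshold:                   # launch decision uses the estimate
            est_total += estimate
            true_total += true_effect
    return est_total, true_total
```

Running this, the aggregated launch-time estimates come out several times larger than the aggregated true impact, which is the gap the paper's correction method targets.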
Network Effects - Google paper on users affecting each other and biasing results. Ya Xu from LinkedIn had a pretty good presentation on this, partially based on Detecting Network Effects: Randomizing Over Randomized Experiments.
First, having multiple treatments increases false positives due to multiple comparisons. Second, the selection process causes an upward bias in the estimated effect size of the best observed treatment. To overcome these two issues, a two-stage process is recommended, in which we select the best treatment from the first screening stage and then run the same experiment with only the selected best treatment and the control in the validation stage.
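The upward bias from picking the best observed arm, and why the validation stage fixes it, can be sketched in a quick simulation (mine, for illustration): even when every treatment is truly null, the screening-stage winner looks positive, while a fresh re-measurement of that same winner does not.

```python
import random

def best_arm_bias(n_sims=500, k=10, n=200, seed=2):
    """All k treatments are truly null (mean 0). Stage 1 picks the best
    observed mean; stage 2 re-measures only that treatment on fresh data.
    Returns (avg stage-1 winner estimate, avg stage-2 estimate)."""
    rng = random.Random(seed)
    stage1_sum = 0.0
    stage2_sum = 0.0
    for _ in range(n_sims):
        # screening stage: k arms, n observations each, all with true mean 0
        means = [sum(rng.gauss(0, 1) for _ in range(n)) / n for _ in range(k)]
        best = max(range(k), key=lambda i: means[i])
        stage1_sum += means[best]           # biased upward by selection
        # validation stage: rerun only the winner with fresh data
        stage2_sum += sum(rng.gauss(0, 1) for _ in range(n)) / n
    return stage1_sum / n_sims, stage2_sum / n_sims
```

The stage-1 average is clearly positive despite every arm being null, while the stage-2 average hovers around zero, which is the whole argument for the screening-then-validation design.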
Top Challenges from the first Practical Online Controlled Experiments Summit - Survey paper from a summit with Airbnb, Amazon, Booking, Facebook, Google, LinkedIn, Lyft, Microsoft, Netflix, Stanford, Twitter, Uber, Yandex attendees, discussing all the various problems they’re seeing in their platforms.
Common Metric Interpretation Pitfalls in Online Controlled Experiments - Metric interpretation issues that a platform approach should try to correct for.
Building Automated Feature Rollouts on Robust Regression Analysis - Uber paper on rollout and rollback. Regressions can be caught and attributed to a particular feature or owner via the analytics system, with rollout/rollback decisions made on the fly based on guardrail metrics.
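A guardrail-driven rollback decision might look something like this minimal sketch (the metric names, thresholds, and test are all my illustration, not Uber's actual system): for each guardrail metric, compare treatment against control with a two-sample z-test, and recommend a rollback if any guardrail regresses significantly.

```python
import math

def rollout_decision(control, treatment, guardrails, z_crit=2.58):
    """Toy guardrail check. control/treatment map metric name ->
    (mean, variance, sample size). Guardrails here are assumed to be
    'higher is better'; a significant drop triggers a rollback."""
    for metric in guardrails:
        c_mean, c_var, c_n = control[metric]
        t_mean, t_var, t_n = treatment[metric]
        se = math.sqrt(c_var / c_n + t_var / t_n)
        z = (t_mean - c_mean) / se
        if z < -z_crit:  # treatment significantly worse on this guardrail
            return ("rollback", metric)
    return ("continue", None)

# hypothetical usage: a crash-free-session-rate guardrail that regressed
control = {"crash_free_rate": (0.99, 0.0001, 10000)}
treatment = {"crash_free_rate": (0.95, 0.0005, 10000)}
decision = rollout_decision(control, treatment, ["crash_free_rate"])
```

The real system described in the paper uses robust regression rather than plain z-tests precisely because raw metric comparisons are noisy and outlier-sensitive; the sketch only shows the shape of the decision loop.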
The Identification and Estimation of Direct and Indirect Effects in Online A/B Tests through Causal Mediation Analysis - A paper from Etsy on decomposing the direct and indirect effects of an A/B test’s treatment on various metrics.
Search specific papers
surface level text similarity results in many false positives where queries with different intents yet similar topics are mistakenly predicted as query reformulations. We propose a new representation for Web search queries based on identifying the concepts in queries and show that we can significantly improve query reformulation performance using features of query concepts.
Context-Aware Web Search Abandonment Prediction - Microsoft research on building classifiers for bad/good abandonments and using that to improve relevance.
Intervention Harvesting for Context-Dependent Examination-Bias Estimation and Estimating Position Bias without Intrusive Interventions - Both of these come from Thorsten Joachims at Cornell. They use propensity estimation and randomized interventions from previous interactions to improve ranking.
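The core idea behind propensity-based debiasing of click data can be shown in a tiny inverse-propensity-scoring sketch (my illustration of the general technique, not the specific estimators in these papers): clicks observed at positions users rarely examine get up-weighted by one over the examination propensity, so documents shown mostly at low positions aren't unfairly penalized.

```python
def ips_relevance_estimates(click_log, propensities):
    """Toy inverse-propensity-scoring estimator. click_log is a list of
    (doc_id, position, clicked) impressions; propensities maps position ->
    estimated P(user examined that position). Returns per-document
    examination-corrected click-rate estimates."""
    totals = {}
    counts = {}
    for doc_id, position, clicked in click_log:
        weight = 1.0 / propensities[position]  # up-weight rarely examined slots
        totals[doc_id] = totals.get(doc_id, 0.0) + (weight if clicked else 0.0)
        counts[doc_id] = counts.get(doc_id, 0) + 1
    return {d: totals[d] / counts[d] for d in totals}
```

With correct propensities, the weighted click rate is an unbiased estimate of click-given-examination, which is the quantity you actually want to rank by; the hard part the papers tackle is estimating those propensities without intrusive randomization.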