Google, Facebook, and Twitter are platforms. So are some retail sites. What does that mean? It means that they provide the means for people to use their technology to create things for themselves. Most of the time, this is a good thing. People can communicate in ways they never could before such platforms existed. Likewise, people can sell things to people they could never reach before.
Now these platforms are in a bind, as you can see in this piece and in other places: Google, Facebook, and Twitter Sell Hate Speech Targeted Ads. They are in a bind partly because of their own approach: they boasted of their ability to use AI to stop such things. They should have been much more humble. AI as it currently stands will only take you so far. Instead of relying on things like AI, they need to have better governance mechanisms in place. Governance is a cost that organizations bear, and oftentimes organizations don’t put proper governance in place until flaws like this start to occur.
That said, this particular piece has several weaknesses. First up, this comment: “that the companies are incapable of building their systems to reflect moral values”. It would be remarkable for global companies to build systems to reflect moral values when even within individual nations there are conflicts regarding such values. Likewise the statement: “It seems highly unlikely that these platforms knowingly allow offensive language to slip through the cracks”. Again, define offensive language at a global level. To make it harder still, try doing it with different languages and different cultures. The same thing occurs on retail sites when people put offensive images on T-shirts. For some retail systems, no one from the company that owns the platform takes time to review every product that comes in.
And that gets to the problem. All these platforms could be mainly content agnostic, the way the telephone system is content agnostic. However, people are expecting them to insert themselves and not be content agnostic. Once that happens, they are going to be in an exceptional bind. We don’t live in a homogeneous world where everyone shares the same values. Even if they converted to non-profits and spent a lot more revenue on reviewing content, there would still be limits to what they could do.
To make things better, these platforms need to be humble and realistic about what they can do and communicate that consistently and clearly with the people that use these systems. Otherwise, they will find themselves governed in ways they are not going to like. Additionally, they need to decide what their own values are and communicate and defend them. They may lose users and customers, but the alternative of trying to be different things in different places will only make their own internal governance impossible.
There is so much wrong in this article, The Real Bias Built In at Facebook – The New York Times, that I decided to take it apart in this blog post. (I’ve read so many bad IT stories in the Times that I stopped critiquing them after a while, but this one in particular bugged me enough to write something.)
To illustrate what I mean by what is wrong with this piece, here are some excerpts in italics followed by my thoughts in non-italics.
- First off, there is the use of the word “algorithm” everywhere. That alone is a problem. For an example of why that is bad, see section 2.4 of Paul Ford’s great piece on software, What is Code? As Ford explains: “‘Algorithm’ is a word writers invoke to sound smart about technology. Journalists tend to talk about ‘Facebook’s algorithm’ or a ‘Google algorithm,’ which is usually inaccurate. They mean ‘software.’” Now part of the problem is that Google and Facebook talk about their algorithms, but really they are talking about their software, which will incorporate many algorithms. For example, see Google’s post at https://webmasters.googleblog.com/2011/05/more-guidance-on-building-high-quality.html (at least Google talks about algorithms, not algorithm). Either way, talking about algorithms is bad. It’s software, not algorithms, and if you can’t see the difference, that is a good indication you should not be writing think pieces about I.T.
- Then there is this quote: “Algorithms in human affairs are generally complex computer programs that crunch data and perform computations to optimize outcomes chosen by programmers. Such an algorithm isn’t some pure sifting mechanism, spitting out objective answers in response to scientific calculations. Nor is it a mere reflection of the desires of the programmers. We use these algorithms to explore questions that have no right answer to begin with, so we don’t even have a straightforward way to calibrate or correct them.” What does that even mean? To me, it implies that any software that is socially oriented (as opposed to, say, banking software or airline travel software) is imprecise or unpredictable. But at best, that is only slightly true and mainly false. Facebook and Google both want to give you relevant answers. If you start typing “restaurants” or some other facility into the Google search box, Google will start suggesting answers to you. These answers will very likely be relevant to you. It is important for Google that this happens, because this is how they make money from advertisers. They have a way of calibrating and correcting this. In fact, I am certain they spend a lot of resources making sure you have the correct answer or close to the correct answer. Facebook is the same way. The results you get back are not random. They are designed, built and tested to be relevant to you. The more relevant they are, the more successful these companies are. The responses are generally the right ones.
- “If Google shows you these 11 results instead of those 11, or if a hiring algorithm puts this person’s résumé at the top of a file and not that one, who is to definitively say what is correct, and what is wrong?” Actually, Google can say; they just don’t. It’s not in their business interest to explain in detail how their software works. They do explain generally, in order to help people ensure their sites stay relevant. (See the link I provided above.) But if they provide too much detail, bad sites can game the search rankings and make Google search results worse for everyone. As well, if they provide too much detail, they can make it easier for other search engine sites – yes, they still exist – to compete with them.
- “Without laws of nature to anchor them, algorithms used in such subjective decision making can never be truly neutral, objective or scientific.” This is simply nonsense.
- “Programmers do not, and often cannot, predict what their complex programs will do.” Also untrue. If this were true, then IBM could not improve Watson to be more accurate. Google could not have their sales reps convince ad buyers that it is worth their money to pay Google to show their ads. Same for Facebook, Twitter, and any web site that is dependent on advertising as a revenue stream.
- “Google’s Internet services are billions of lines of code.” So what? And how is this a measure of complexity? I’ve seen small amounts of code that was poorly maintained be very hard to understand, and large amounts of code that was well maintained be very simple to understand.
- “Once these algorithms with an enormous number of moving parts are set loose, they then interact with the world, and learn and react. The consequences aren’t easily predictable. Our computational methods are also getting more enigmatic. Machine learning is a rapidly spreading technique that allows computers to independently learn to learn — almost as we do as humans — by churning through the copious disorganized data, including data we generate in digital environments. However, while we now know how to make machines learn, we don’t really know what exact knowledge they have gained. If we did, we wouldn’t need them to learn things themselves: We’d just program the method directly.” This is just a cluster of ideas slammed together, a word sandwich with layers of phrases that says nothing. It makes it sound like AI has been unleashed upon the world and we are helpless to do anything about it. That’s ridiculous. As well, it’s vague enough that it is hard to dispute without talking in detail about how A.I. and machine learning work, but it seems knowledgeable enough that many people will think it has greater meaning.
- “With algorithms, we don’t have an engineering breakthrough that’s making life more precise, but billions of semi-savant mini-Frankensteins, often with narrow but deep expertise that we no longer understand, spitting out answers here and there to questions we can’t judge just by numbers, all under the cloak of objectivity and science.” This is just scaremongering.
- “If these algorithms are not scientifically computing answers to questions with objective right answers, what are they doing? Mostly, they “optimize” output to parameters the company chooses, crucially, under conditions also shaped by the company. On Facebook the goal is to maximize the amount of engagement you have with the site and keep the site ad-friendly. You can easily click on “like,” for example, but there is not yet a “this was a challenging but important story” button. This setup, rather than the hidden personal beliefs of programmers, is where the thorny biases creep into algorithms, and that’s why it’s perfectly plausible for Facebook’s work force to be liberal, and yet for the site to be a powerful conduit for conservative ideas as well as conspiracy theories and hoaxes — along with upbeat stories and weighty debates. Indeed, on Facebook, Donald J. Trump fares better than any other candidate, and anti-vaccination theories like those peddled by Mr. Beck easily go viral. The newsfeed algorithm also values comments and sharing. All this suits content designed to generate either a sense of oversize delight or righteous outrage and go viral, hoaxes and conspiracies as well as baby pictures, happy announcements (that can be liked) and important news and discussions.” This is the one thing in the piece that I agreed with, and it points to the real challenge with Facebook’s software. I think the software IS neutral, in that it is not interested in the content per se so much as in how the user is responding or not responding to it. What is NOT neutral is the data it is working off of. Facebook’s software is as susceptible to GIGO (garbage in, garbage out) as any other software. So if you have a lot of people on Facebook sending around cat pictures and stupid things some politicians are saying, people are going to respond to it and Facebook’s software is going to respond to that response.
- “Facebook’s own research shows that the choices its algorithm makes can influence people’s mood and even affect elections by shaping turnout. For example, in August 2014, my analysis found that Facebook’s newsfeed algorithm largely buried news of protests over the killing of Michael Brown by a police officer in Ferguson, Mo., probably because the story was certainly not “like”-able and even hard to comment on. Without likes or comments, the algorithm showed Ferguson posts to fewer people, generating even fewer likes in a spiral of algorithmic silence. The story seemed to break through only after many people expressed outrage on the algorithmically unfiltered Twitter platform, finally forcing the news to national prominence.” Also true. Additionally, Facebook got into trouble for the research they did showing their software can manipulate people by….manipulating people in experiments on them! It was dumb, unethical, and possibly illegal.
- “Software giants would like us to believe their algorithms are objective and neutral, so they can avoid responsibility for their enormous power as gatekeepers while maintaining as large an audience as possible.” Well, not exactly. It’s true that Facebook and Twitter are flirting with the notion of becoming more like news organizations, but I don’t think they have decided whether or not they should make the leap. Mostly what they are focused on are channels that allow them to gain greater audiences for their ads with few if any restrictions.
In short, like many of the IT think pieces I have seen in the Times, it is filled with wrongheaded generalities and overstatements, in addition to some concrete examples buried somewhere in the piece that were likely the things that generated the idea to write the piece in the first place. Terrible.
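The "neutral software, non-neutral data" point above can be sketched in a few lines of Python. This is only a toy under stated assumptions: the `Post` fields, the weights in `engagement_score`, and the feedback rule in `simulate_round` are all hypothetical, and none of it reflects how Facebook's real software works.

```python
from dataclasses import dataclass


@dataclass
class Post:
    title: str
    likes: int = 0
    comments: int = 0
    shares: int = 0


def engagement_score(post):
    # Hypothetical weights. Only reactions are counted; the subject
    # matter of the post is never examined.
    return post.likes + 2 * post.comments + 3 * post.shares


def rank_feed(posts):
    # Highest engagement first. The title (the "content") is ignored.
    return sorted(posts, key=engagement_score, reverse=True)


def simulate_round(posts, audience=1000):
    # Hypothetical feedback rule: each post is shown to a share of the
    # audience proportional to its current score, and one viewer in ten
    # adds a like. Scores are computed up front so every post is judged
    # on the state at the start of the round.
    scores = [engagement_score(p) for p in posts]
    total = sum(scores) or 1
    for post, score in zip(posts, scores):
        views = audience * score // total
        post.likes += views // 10


posts = [Post("cat picture", likes=50), Post("hard news story", likes=5)]
simulate_round(posts)
for post in rank_feed(posts):
    print(post.title, post.likes)
```

In this toy, the post that started with more likes gets shown more and pulls further ahead in absolute terms after each round, even though the code never looks at what either post is about. Garbage in, garbage out.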
Things I am interested in or working on these days: AI, WebSphere setup, Python, Twitter programming, development in general, configuring Netscalers, cool things IBM is doing, automation, among other things.
- If you have the AI bug and think you want to do some Prolog programming, you need this: What Prolog implementation to choose? What’s fastest? Compatibility?
- Deep Learning is hot in AI. If you want more info, this is good: Deep Learning Tutorials — DeepLearning 0.1 documentation
- Sigh. This debate never goes away in AI: Why AlphaGo Is Not AI – IEEE Spectrum
- More on the hysteria that AI brings: The founder of Evernote made a great point about why AI (probably) won’t kill us all – Vox
- Ignore most AI hysteria, but do read this: What does it mean for an algorithm to be fair? | Math ∩ Programming
- Want to whip up a quick mobile app? Consider: Mobile App Builder – new service now available – Bluemix Blog
- For power users, there’s: How to create an insane multiple monitor setup with three, four, or more displays | PCWorld
- Need virtual images? Take a look at this: Images | VirtualBoxes – Free VirtualBox® Images
- For hardcore WAS users, this is helpful: Installing optional Java 7.x on WebSphere Application Server 8.5 (Application Integration Middleware Support Blog)
- A classic. Anyone tuning WAS needs this: Case study: Tuning WebSphere Application Server V7 and V8 for performance
- Want to learn Python? Write your own Twitter client? Or do both? Then there’s this: How To Build a Twitter “Hello World” Web App in Python | ProgrammableWeb
- More on programming Twitter: How To Use The Twitter API To Find Events | ProgrammableWeb
- Nice little project to try, here: Create a mobile-friendly to-do list app with PHP, jQuery Mobile, and Google Tasks
- Creating Simple Responsive HTML5 and PHP Contact Form | Future Tutorials
- Setting up a Linux system? Then you want to read this: Most secure way to partition linux? – Information Security Stack Exchange
- Want to learn Linux? This is essential! IBM developerWorks : Technical library concerning Learning Linux
- If you are doing performance work on Unix, you will likely use vmstat. Even if you know vmstat, this is good to review: What to look for in vmstat – UNIX vmstat command
- Wow! OS/2 is still alive! OS/2: Blue Lion to be the next distro of the 28-year-old – Yahoo Finance
- Talk about old tech! This makes OS/2 seem fresh! It’s Insane that New York’s Subway Still Runs on This 80-Year-Old Switchboard | Motherboard
- I was doing some work on Netscaler and found this useful in comparing the set up of one Netscaler config with another: Export Netscaler Config – NetScaler Application Delivery – Discussions. This is also useful: Netscaler 9 Cheat Sheet.doc – netscaler9cheatsheet.pdf
- I thought this was a good development for everyone interested in Node: IBM Buys StrongLoop To Add Node.js API Development To Its Cloud Platform | TechCrunch
- A lot has changed with IBM’s OpenPOWER. Forbes gets you up to date, here: IBM’s OpenPOWER: A Lot Has Changed In Two Years – Forbes
- Cool stuff here: Access your Docker-based Raspberry Pi at home from the internet · Docker Pirates ARMed with explosive stuff
- I was using Perl scripts on Linux to send messages to my mobile device via Pushover. This was good for that: pushover Archives – Perl Hacks
- I was also using WinSCP for that and this helped: Scripting and Task Automation :: WinSCP
- For all those trying to succeed in IT but feeling you are running into a ceiling, you should read this: Tech’s Enduring Great-Man Myth, or this: When It Comes to Age Bias, Tech Companies Don’t Even Bother to Lie | Dan Lyons | LinkedIn
- Linus Torvalds is always interesting, and this is especially good: Linux at 25: Q&A With Linus Torvalds – IEEE Spectrum
- Very cool! Particle | Build your Internet of Things
- And finally some links to good stuff on UML online: Multi-layered web architecture UML package diagram example, web layer depends on business layer, which depends on data access layer and data transfer objects.
A nice use of computing to refute an old conjecture from no less than Euler.
Found thanks to a tweet from
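The post does not say which conjecture is meant. Assuming it is Euler's sum of powers conjecture (that at least n nth powers are needed to sum to another nth power), the counterexample that Lander and Parkin found by computer search in 1966 is easy to check. The helper name `is_counterexample` is just for illustration.

```python
def is_counterexample(terms, total, power):
    # True when the given terms, each raised to `power`, sum exactly to
    # total ** power. Python integers are arbitrary precision, so there
    # is no overflow to worry about.
    return sum(n ** power for n in terms) == total ** power


# Four fifth powers summing to a fifth power, one fewer than the
# conjecture (as assumed above) says should be needed.
print(is_counterexample((27, 84, 110, 133), 144, 5))  # True
```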
If you haven’t heard, Meerkat and Periscope are two apps that allow one person to stream an event and have others watch it. For example, here is an artist streaming her work on a painting while others watch and interact: Wendy MacNaughton paints live on Periscope My… – Austin Kleon.
It’s an interesting idea. Once people get creative, there will be all types of events that people stream, from the obvious (porn, music concerts) to things no one thought of before.
I think one of these not so obvious ones will be virtual tourism. Essentially someone will visit a place like Japan and stream the cherry blossom festival or go to Pamplona for the running of the bulls and others will watch in real time. Maybe people will sponsor the person ahead of time, or the person will wear a shirt with ads on it, or find some way to make revenue. In return, lots of people can see something they might not be able to see otherwise.
People will use Periscope and Meerkat in all kinds of ways. Expect this to be one of them.
Do you find it weird when you search for something, then go to other sites, and it seems like the product is following you around? Do you worry that sites are tracking information about you and you want to stop it?
I’d like to say there is an easy way to put an end to such tracking, but it doesn’t seem to be so. If anything, companies like Facebook, Google and others have a big financial interest in tracking you, regardless of what you think, and they are going to make it hard for you to put an end to it all.
That said, if you still want to take action, I recommend these links. They highlight tools you can use and steps you can take to limit tracking. You don’t have to be technical to read them, but you have to be comfortable making changes to your system.
- How to prevent Google from tracking you – CNET – this may be the best article that I read. Mostly focused on Google. There are useful links to tools in here and plugins you can use, like Disconnect and Ghostery. Somewhat technical.
- Facebook Is Tracking Your Every Move on the Web; Here’s How to Stop It – This Lifehacker article has more on how to deal with Facebook tracking you than Google, but it is also good.
- How to Stop Google, Facebook and Twitter From Tracking You – this piece from ReadWrite talks mostly about the Disconnect tool, but it does it in conjunction with discussion of some other tools. Seems less technical than the first two, if you found the first two links too hard to follow.
- How to Stop Google From Tracking You on the Web on NDTV Gadgets has tips that are more manual in nature, if you don’t want to download tools. Also some good information on how to deal with mobile phone tracking.
- Delete searches & browsing activity – Accounts Help via Google comes straight from the source of the tracking.
Some thoughts of my own:
- Consider using two browsers: one for your Google use (e.g. Chrome) and one for other uses (e.g. Firefox or Safari). The non-Google browser you can lock down with blockers and other tools, while the Google oriented browser could be limited to just what you need to integrate with Google.
- Avoid sites that track you, like Facebook. I know, it isn’t easy. If you have to go on Facebook (say you get a call from a sibling asking why you haven’t commented on the new baby pictures there), limit yourself to a few thumbs up and leave it at that. (Knowing Facebook, they will still find a way to do something with even that data.)
- If you are really concerned, avoid Google altogether and use other search engines, like DuckDuckGo, and other email services, such as Outlook.com. There can still be tracking, but in theory this should make it harder.
- If you use any of these tools, get into the habit of using them and keeping them up to date.
- Don’t forget to do the same thing on your mobile devices. Facebook can track your activity on your mobile phone, regardless of what you may be doing on the web. You can be tracked via apps just as easily as you can be tracked from your browser.
- If you do nothing else, install the Disconnect plug-in, activate it, and go to a newspaper site. You will be amazed just how much tracking is going on. (Also, you do NOT have to sign up for the premium version to get it working.)
There was a lot of talk when Cory Arcangel published the book above. Essentially it is a collection of tweets from others tweeting about…well, working on their novel! It’s clever, but it made me think that it is just the beginning of the works of art that could be mined from the colossal number of tweets each day. There’s gold in there amongst all the Twitter rage and minutiae about people’s day. It deserves better.
Meanwhile, more about that book, here: A Novel Compiled From Crowd Sourced Tweets About Writing A Novel | MAKE.
After the frustration with the Twitter service for changes like this, I thought I would give up Twitter. However, Twitter is the sum of a number of parts: there is the service that Twitter provides, from the backend servers to the APIs to the user interfaces and client software you use; and then there are the people that contribute to Twitter. Among those contributors are people I really enjoy socializing with and cannot connect with any other way. To give up all of Twitter means tossing out the baby, the bathwater and even the tub itself. That’s dumb. (I do dumb things often, but typically correct most of them in time. :))
To get around that, I decided to use my limited software skills and the APIs that Twitter provides to write my own Twitter client, in a way. It is a hack, but it is a good hack (for me). I am able to control what I see this way. Not only do I not have promoted tweets, etc., in my feed, but I am able to get rid of things like RTs from everyone, rather than having to turn off RTs one account at a time. I’m also able to save all the tweets in a spreadsheet or some other format, so I can look at them when I am less busy, or decide on other filters I want to apply, etc. Later on I can write more filters so if a trending topic gets to be too much, I can just delete it or save it to a different file for later.
Now my Twitter experience has gone from poor to great (for me). I have thrown out the dirty bath water, but kept the tub and the baby. This makes more sense, obviously.
Last but not least, I appreciate all the people who expressed concern over my leaving Twitter. It was very kind of you, and it is why I want to stick around, if I can.
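The filtering and saving steps described above might look roughly like this in Python. This is a sketch, not the actual client: fetching from the Twitter API is left out entirely, and the tweet shape (dicts with `user`, `text`, and an optional `promoted` flag), the `RT @` prefix test, and the function names are all assumptions made for illustration.

```python
import csv
import io


def is_retweet(tweet):
    # Assumption: old-style retweets start with "RT @".
    return tweet["text"].startswith("RT @")


def keep(tweet, muted_words=()):
    # Drop promoted tweets, retweets, and anything mentioning a muted
    # word (for example, a trending topic that has become too much).
    if tweet.get("promoted"):
        return False
    if is_retweet(tweet):
        return False
    text = tweet["text"].lower()
    return not any(word.lower() in text for word in muted_words)


def save_csv(tweets, handle):
    # Write tweets as spreadsheet-friendly CSV to an open text handle.
    writer = csv.writer(handle)
    writer.writerow(["user", "text"])
    for tweet in tweets:
        writer.writerow([tweet["user"], tweet["text"]])


# Hypothetical sample data standing in for an API response.
timeline = [
    {"user": "alice", "text": "Working on my novel"},
    {"user": "bob", "text": "RT @carol: big news"},
    {"user": "brand", "text": "Buy now!", "promoted": True},
    {"user": "dave", "text": "Thoughts on #TrendingTopic"},
]

kept = [t for t in timeline if keep(t, muted_words=["#trendingtopic"])]
buffer = io.StringIO()
save_csv(kept, buffer)
print(buffer.getvalue())
```

Swapping the `io.StringIO` buffer for `open("tweets.csv", "w", newline="")` would write a file that a spreadsheet can open, and tweets rejected by `keep` could just as easily be written to a second file for later instead of being discarded.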