Bing and Big Think are hosting “Farsight 2011: Beyond the Search Box,” a four-hour event today in San Francisco looking at the future of search. We’ve got various luminaries lined up. Google will be on a panel. There’s that whole Google says Bing is copying them thing that just happened. Plus, Blekko’s here, and it just rolled out a new spam filter seemingly aimed at search enemy number one these days, content farms.
And here we go. Victoria Brown is welcoming us. She’s from Big Think, talking about how she talked with Stefan Weitz from Bing some time ago about doing this session.
Now Vivek Wadhwa is up. Talking about how search hasn’t really changed — you enter words and get links. The main change is, he says, the web has been overrun by spam. Does no one remember the spam glory days of Infoseek and AltaVista? I know some old fart SEOs do.
Vivek is talking about now wanting his computer to know about the type of restaurants he wants to go to, who his friends are. He’s hoping today we hear good ideas and hear good news about how the web is going to change.
Now we get Peter Thiel doing a keynote. There are all types of interesting tech that can be built. Whoa, he just said that Powerset, which he invested in, became Bing. Not. So. So not so. Anyway, beyond tech, let’s talk the cash in tech. So far, Microsoft still doesn’t seem to be making money. He estimates you need 35% marketshare to get to break even. Hmm. Hmm. I don’t know. Not an economics guy. But I’m pretty sure we had search engines profitable with lower share. Especially if you don’t mind shoving out ads to an unknowing audience.
Until fixed costs are solved, you have a monopoly. As an investor, he’s only looking at companies that seem to have dealt with that (I think he said). As for Powerset, it ran into the fixed cost problems. They badly estimated that. Not to mention that Powerset wasn’t a general purpose search engine and was, to my understanding, a hell of a lot more processor intensive, generating results that weren’t as useful for most searchers as those from regular search engines.
Long talk about economics, concluded by saying only Microsoft has the capital costs to compete with Google. Sorry, Blekko.
Now we’re getting the lead-up to the “Who Will Win The Spam Wars” panel, with Matt Cutts from Google, Harry Shum from Bing and Rich Skrenta from Blekko. This should be fun.
So Matt, what about the mess, Vivek asks, meaning spam. Matt’s skipping this and instead diving into the story about the Google says Bing is copying them thing.
So Matt asks Harry, what about this whole thing we found? Harry says interesting that Matt talked about all the improvements that Bing has, and you can’t discount all the work that Bing does.
But in his view, he thinks they’ve just discovered a new form of spam or click fraud. He wishes customers could take that to them, so they’d have time to study it. Hmm, sounds like he’s saying Google was doing something fraudulent.
Matt’s referring to a few outlier examples, he says. Bing looks at a lot of different signals. It’s not like we copy anything. We learn from the customers … from what type of queries they type.
Matt’s saying well no, those signals Bing uses don’t seem to just be showing up in synthetic tests.
Harry says you have to be careful. We learn from our customers. “Did you mean that Google owns the data, because the person uses the Google search engine.”
[You all getting this now? Bing probably won’t say they copy Google, because it seems to argue that it’s merely monitoring users. Which is also true, though the end result might be the same.]
Matt went off on the Suggested Sites feature and I think he says it’s not really clear what data it takes. Harry says well, Google does the same. Matt says oh no we don’t and snaps his fingers. Except that was in his mind. He actually says categorically we don’t.
And Rich is either thinking whew, I’m not being yelled at, or when do I get to talk. Oh, he just got called on.
The web is full of garbage, he says. Now he’s gone into some numbers, which I lost track of, sorry, live blogging on tape is hard. That’s a Real Genius reference.
Basically, there’s all this stuff dumped on the web each hour, and it’s all designed to deceive search engines, because there’s value to being in those search engines.
Matt? He’s completely right. Likes that they have SpamClock.com with 1 million pages per hour, and Matt’s first thought was, only a million?
Vivek asks about Blekko’s banning of major spam sites. Matt says we prefer to have an algorithmic approach than manual action, on the whole.
Harry? Google is responsible for so much of the spam we see, why they appear (think he means because of AdSense). Yes, why did they do that? Google ads. It’s easy to figure out the percentage of AdSense sites that are spam. You guys understand more of what’s going on and have a responsibility to share so we can tackle this problem jointly. Wow, we’re totally gloves off now.
Matt, many don’t realize that if the spam team finds and ejects someone, that kicks them out of AdSense. And if there were no AdSense, we’d still have spam.
Vivek, but Matt, what’s your incentive to clean all this up? Matt, we’ve always had the philosophy to do the best for users. Wouldn’t do pop-ups, for example.
Vivek, but how about content farms, and places like Demand help you earn. Matt comes back to saying you need an algorithm. Rich gets asked about algos, but didn’t catch all of what he said, sorry. Now Harry’s talking about Blekko, says he watches what they are doing. Well, technically watches what Blekko users are doing. C’mon, give me a laugh.
Harry goes on, we really need a combination of manual intervention and algorithms. And one thing that needs to be looked at is the notion of authorship, do authors really have the authority to do well.
Vivek, what’s the difference between the three of you? How are you better than each other?
Shit, now I’m getting called out. Oh, it’s OK, it’s about one of my earlier articles that Harry’s talking about, where I said we really need better metrics that everyone agrees on about measuring relevancy. Harry’s going to talk to me after the panel about this. And I shouldn’t worry about that big piece of wood he’s holding behind his back 🙂
Matt’s saying Google has a bunch of evaluators and that’s how they think about this stuff. They work hard to be fast. Spam’s an issue, but I think he’s saying it’s not as bad as some other issues.
Harry says they do have internal metrics, too. But he says we should all come together as an industry to come up with stats. Google as the leader has responsibility to do this.
Rich is talking! It’s not the algorithm and depending on that. It’s editorial decisions too. And I didn’t catch part of what he said. I’m really, really sorry. Have I mentioned having no sleep last night?
Harry’s talking again, and he really wants to get some defined numbers. He’s waiting for Matt to sing kumbaya on this with him.
Matt’s saying that you can try to prerank a bunch of queries and do other types of tests to figure out if things are working or not.
Rich is back, talking about how his slashtags allow for curating the best content, as well as the bad content — all assembled by humans, so why not leverage both of them, killing the spam.
Matt says he’s got a spam blocking tool that’s in testing to let individual users block individual sites they don’t like. This might come.
Where’s the web going? Didn’t catch some of what Matt says, but he says Google is being very transparent.
Harry says he agrees with Matt, search quality is everything. So far collectively, there’s a lot of good work. But human judgement is hard when “library” on the West Coast can mean something different than on the East Coast. The goal is to go one step behind and figure out what’s in people’s minds and help them actually achieve it.
Rich, our vision is a shared curated web that is free of spam. Honest, that all came out, nice and clean, tweetable. Someone at Yahoo, hire Rich.
You have to use the Wikipedia model and crowd sourcing. The only way to clean up the web is to bring humans back.
Matt, love the idea there might be 20 more Blekkos. It proves with a small team, you could do amazing stuff.
Harry, what about dealing with costs, as Thiel talked about earlier. It’s just the kind of investment you have to do, cutting cost is one of his top things.
Rich, how do you do it? We’ve got $5 million in hardware, and you can do it. With $25 million overall, you can have a good web search engine. There should be more. We only have two engines in the market, so we should have a search. With search so valuable, why isn’t there more investment?
And that’s it. The three have gone off behind a curtain to start hitting each other in public.
Jaron Lanier is up. I liked him at SXSW last year. Just saying. Talks about how Esther Dyson who is here once suggested a cost to email, but we want things friction free — but as a result, we have spam.
Now Jaron says he knows nothing about the whole Bing copying Google thing. He doesn’t think it was something willful, however. It’s just that as Bing looks at what people do, Google’s content leaks out. That’s kind of reasonable.
Going on, he says it’s kind of like with YouTube saying to the content producer that you need to produce it. Not to be partisan, it does seem to him that Google’s complaining about the same system that it benefits from.
Jaron thinks you need to have friction, as Esther has suggested. Said some things I didn’t catch, sorry, needed to multitask poorly for a second. And Jaron says he had slides that aren’t appearing — so see, that would have helped me, your poor live blogger.
But we need to move to a prior organization of data, things like Facebook where you have an enforced identity.
Jaron says he’d happily pay 1 penny per search, if it was easy.
Esther is up now, talking about how Bill Gates told her at dinner that the future of search is about verbs.
In the past, search was about nouns. Hey, I like this. I have a whole “Google is a noun, Foursquare is a place, Facebook is a person, they’re all nouns” thing. But no, it doesn’t keep going in that direction.
The new form of search is about verbs. Not just the location but how do I get there.
So is the future of search an entirely new market? If Google gets ITA, Bing with Bing Travel … you can make the argument that as the web is going app like, search engines are a set of apps.
OK, Di-Ann Eisnor from Waze is up. She was sitting next to me a minute ago and is very nice. They gather paths that people make, as they travel. They provide real time GPS info based on crowdsourcing. Where should I go. They’ll give the ETA right now. They watch for tweets, things like hazards.
Enter your destination, enter your ETA and they give you options. They have prizes in places where they don’t have info, to get you to go there and get points. Hey, Foursquare meets GPS.
Lots of stats and demo of the service.
Here’s Blaise Aguera y Arcas, a shining star at Bing. Talking about mobile, old view was a dumbed down web, 1.6 versus double that searches on desktop. None of this is true now. Mobile is much more desktop like.
Average voice query length is 3.5, almost like on a PC. Plus you have phone sensors, like camera, GPS that can help.
Lots more stats and observations.
Search 1.0, looking for nouns, browser-based, read-only, lots of things — I’ll add a screenshot in a bit. But Search 2.0 breaks all these things.
I’ve had to go out for a bit, so I’ve missed some of the panels. Jumping back into the individual presentations.
Gary Small, Director of UCLA’s Center on Aging, is up. He’s talking about that whole is Google making us stupid thing. His team tried to discover how search is affecting our brains. So he needed people who’d never been online as a baseline – these were pretty much older people. Then they matched them to similar older people (62ish) who were net savvy.
Put them in MRI machines, showed them search pages to review. The “net naive” had same images when dealing with search or text. But with savvy people, your brain got active twice as much when doing searches. Not twice as smart, as some reported, just more activity.
Gord Hotchkiss actually did a big article about this for Search Engine Land, and how brains can be trained in relation to search. See it here: Is Google Rewiring Our Brains?
So is tech weakening our memory? We have to pick and choose what we store in our brains.
Surgeons who play video games make fewer errors.
In future, he predicts we’ll have headbands or earpieces, you’ll think to your computer, which in turn might talk to a friend’s computer and then into their heads.
Computer mindreaders. CMU is developing experiments where computers read thoughts. Really? Wow, I have to look that up.
In the end, not hurting our memory but changing how we use it, and an opportunity to shape how we use it.
Ilya Segalovich, cofounder and CTO of Yandex, is up. He says he was asked to talk about why Yandex is so popular in Russia. Yes, Russia is one of the few places where Google isn’t number one.
In 1991, there was no good coverage of Russian web pages. They’ve grown over time, closing in on 70% share of the market. Covers important vertical search markets, too.
Search quality: thinks it’s important to have internal and external metrics (there’s a company that tries to measure externally). When they made a new algorithm a year ago, had a sharp rise in marketshare.
Strong focus on local audience. 77 regions in Russia; 5 countries that use Cyrillic alphabet.
What could Google steal from this, is a question. Good shopping, bilingual search are some things he tosses out.
And now, my favorite TV & movie search engine, Clicker.com. Paul Wehrley, cofounder and COO, is up.
They only list where legal content is located on the web; they know seasons, high quality meta data, tons more good stuff.
Why start it? A few years ago, there was enough core data online that could be attacked. Knew they needed to come up with a structured way to get to the content.
For Clicker, relevancy means getting rid of “noisy” content and links to illegal content.
Among the solutions is user data, watching what people do — they have a new partnership with Facebook to personalize results (it’s kind of neat).
Q&A panel now, and I’ve just been listening, for once. But an interesting nugget from Segalovich who wants to see search history portable. And by history, he means all the things you’ve clicked on and searched on should be transferable from one search engine to another.
That brings him back to today’s Google-Bing dispute. Looking at it this way, he says it’s not Google’s data that’s somehow being taken but the user’s data that they want to take.
I thought Google did allow for search history to be exported. However, looking at Google’s Data Liberation Front site, I don’t see search listed.
And with that, the event concludes.