This page last changed 01 December 2008
Boston, Massachusetts, April 27-28, 2009
Program
This annual meeting provides a forum and point-of-reference
for all those interested in the intricacies of Search and Retrieval.
The meeting draws those with a professional interest in search
engines -- such as search engine designers and developers --
and those interested in applying search engines in their own
professional environments. Search is at the heart of information
retrieval, and the Search Engine Meeting provides an annual point
of reference as to what is happening in this fast-moving and
exciting field.
All presentations are given
sequentially; there are no parallel sessions or parallel presentations
at this meeting.
Sunday April 26
Pre-Conference Tutorial
Sunday Afternoon (Stephen Arnold)
Monday April 27
THE PAPERS BELOW ARE CURRENTLY NOT IN
FINAL PRESENTATION ORDER
Day One Opening Keynote
[Speaker to be announced]
Dmitri Soubbotin
Semantic Engines, New York
The Variety of Goals and Applications of Semantic Approach
to Search
This presentation compares different approaches to presenting
search results to users. Various types of search queries have
been identified based on the user intent. Accordingly, different
types of results are identified: a conventional list of links;
a hierarchy or a cluster of concepts with underlying links; a
direct answer. A multi-document summary of Web sources is introduced
as a legitimate type of search result, using SenseBot as an example.
A "semantic cloud" of key concepts is suggested as a
means of controlling the focus of the summary. The idea is to
give the user a quick answer, obviating the need to drill
down into the sources in many cases.
Semantic analysis is discussed as a way to augment traditional
search with its page ranking system. Examples of intelligent
applications based on the approach are presented. The intersection
between semantic search engines and the Semantic Web is discussed
as a mutually beneficial opportunity. Two major challenges facing
semantic systems are the ambiguity of natural languages, and
high infrastructure requirements. Some ways to deal with these
challenges are discussed.
Diane Burley
Nstein, Canada
A Pragmatic Look at the Semantic Web
Research shows that there are two types of site searchers: those
who rely on the search bar and those who are link-dominant, who,
like a spider, crawl from page to page using inline links,
if those links exist. The challenge with the search bar is that
unless the reader types in exactly the words that the journalist
used, the story will go undiscovered. A story on "great
stuffings for your holiday bird" may not appear if the reader
happens to type in the word "dressing" and the story
was not tagged properly. Move beyond the realm of synonyms to
denotations and connotations, and thus is the semantic web:
a world filled with literal and figurative associations that
could help readers find what they are looking for, regardless
of whether they know they are looking for it. Tagging
is the simple answer, while rich metadata are the crux of the
semantic web.
The advancement of the semantic web is a transformative
time for news sites. If simple tagging seems onerous, how is it
possible that we could consistently and comprehensively semantically
tag and, more importantly, semantically associate
assets, be they article, image, motion or audio? The
answer is multifaceted. In this presentation we take a rudimentary
look at the components of the semantic web: tagging, taxonomies,
authority files, knowledge bases, and look at some of the tools
that will help you automatically tag and associate. Further,
how can we expose these rich metadata to create a better reader
experience? Indeed, how can we expose these metadata on the back
end so that we editors can research or package news with greater
ease? Does automation obviate the need for mediation? Just how
do editors thrive in the semantic web?
Frank Bandach
Eeggi, California
Semantic Coherence and a New Search Paradigm
This presentation discusses the engineering of an indexing-numeric
language for the manipulation of semantics, grammar, concept
novelty, responsiveness, disambiguation, translation, and its
evolution into basic rationality towards a new search engine
paradigm.
Kathleen Dahlgren
Cognition Technologies, California
The Puzzle of Semantic Technologies
Semantics is now center stage in search, with various approaches
having been proposed. Most current approaches to Web 3.0, or
the Semantic Web, primarily tag pages in a tagging language.
Others use ontology, so that users can query "car"
and see retrievals with "SUV" or "Porsche",
or they present users with summaries or pull-downs based on ontology.
Still other semantic approaches focus on syntax parsing in order
to recover the formal semantics or argument structure of text
and query. Another additive approach to semantics is the building
of a Semantic Map. A Semantic Map contains word-level and contextual
information that enables a search engine to do complete word
sense disambiguation, or understanding, at the word level. Our
goal should be a complete approach that treats all aspects of
semantics, including sense disambiguation, ontology, synonymy,
commonsense knowledge, aspect, information to assist in pronoun
reference and discourse reasoning and any other information required
to replicate full lexical and formal semantic reasoning.
Martin Baumgärtel
bioRASI, California
Advanced Visualization of Search Results: More Risks, or More
Chances?
Many products have been deployed and numerous articles have
been published about breakthroughs in the visualization of search
results. Yet, is search result visualization common practice
in everyday information retrieval tasks? This presentation addresses
the gap. Results from case studies and from the analysis of human-computer-interaction
are presented. Direct user feedback from the visualization of
semantic relations is summarized and a general theory is derived.
Whether you work on visualization or have investigated visualization
technologies/designs to improve the search experience in your
environment, this presentation will give you valuable advice,
help in maintaining a realistic view, and methods for avoiding common
pitfalls.
Panel: Non-Text Search Technologies: Speech, Images, Video
Chaired by Susan Feldman
(IDC)
Speakers to include:
Thomas Wilde (Yahoo,
Massachusetts): How Video Gets Found: changing consumer search
strategies for audio and video online and implications for content
producers
Stephen E. Arnold
AIT, Kentucky
Google Looks Beyond the Laundry List
This presentation describes three of the technologies that
are shifting Google from a service which requires the user to
enter a query, to a service that presents search within a user's
context. Each of these technologies is in use in various Google
services. Google's existing and better-known search methods
are complemented by functions that operate automatically
or semi-automatically to improve the user experience. First,
Google's Chrome is a way for Google to connect the user to Google
services and Google services to the user. One key component in
Chrome is its ability to track a user's behavior, perform predictive
analyses, and give the user access to containers or virtual machines.
Chrome is not an operating system; Chrome is a connectivity mechanism
that operates regardless of the user's computing operating system
or device. Second, Google's janitor technology allows the company
to "clean up" structured and unstructured information.
One way to use the cleaned up data is to produce an automatic
dossier about a person, place or thing. Third, Google's dataspace
technology provides an environment in which Google can generate
new types of metadata about information processed by Google's
indexing system.
Google has not issued public information about these
innovations, but each is disclosed in open source documents such
as technical papers, patent documents, and public presentations
by Google professionals. The conclusion drawn from this review
of three interesting Google innovations from the 2007-08 period
is that the company is shifting from key word queries to search-enabled
applications. These applications present the user with solutions
to information problems, not a laundry list of results.
Francisco Corella and Karen
Lewison
Pomcor, Oregon
Searching the Web More Effectively with Multiple Simultaneous
Queries
We describe a Web search facility that reduces the time and
effort that it takes the user to home in on the desired results
for difficult search problems. When the user enters a query the
search facility anticipates possible follow-up queries, issues
them immediately, and allows the user to browse the search results
of the original query and these additional queries simultaneously.
Additional queries may include a respelling of the original query,
related queries, and/or sub-queries. (By sub-query we mean a
query consisting of a subset of the search terms of the original
query.) We describe a parallel algorithm that efficiently produces
an optimal set of sub-queries and their results in the important
special case where the original query has zero results; although
it is rare for a query that targets the Web at large to have
no results, the zero-result case is important for queries that
target a particular site.
We have built a prototype of such a search facility
as a client-side script, implemented on the Adobe Flex platform,
and thus running on the Flash plug-in, that accesses the Yahoo
search engine via the Yahoo Astra Web APIs library. The Yahoo
search engine has not been modified for this purpose, so our
innovations are implemented entirely on the client side. We point
out, however, that it would be beneficial to transfer parts of
the implementation to the server side, and explain how this could
be done. The prototype only handles purely conjunctive queries,
but we also describe a method for handling general Boolean queries,
and we describe an extension of the parallel zero-result algorithm
to the general case.
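As a rough illustration of the sub-query idea in the abstract above, the following Python sketch finds the largest subsets of search terms that still return results when the full conjunctive query returns none. It is not the authors' parallel algorithm: it runs sequentially, and the toy index, the function names, and the meaning given here to a "best" set of sub-queries (maximal subsets with at least one hit) are all assumptions made for the example.

```python
from itertools import combinations

# Toy inverted index standing in for a site search engine (hypothetical data).
INDEX = {
    "red": {1, 2},
    "leather": {2, 3},
    "sofa": {3, 4},
}


def count_results(terms):
    """Number of documents containing every term (purely conjunctive query)."""
    postings = [INDEX.get(term, set()) for term in terms]
    return len(set.intersection(*postings)) if postings else 0


def maximal_subqueries(terms, count=count_results):
    """Return the largest sub-queries (subsets of terms) that have results.

    Subsets are tried from largest to smallest. A subset is skipped when it
    is contained in a sub-query that already succeeded, so only maximal
    sub-queries are reported.
    """
    found = []
    for size in range(len(terms) - 1, 0, -1):
        for subset in combinations(terms, size):
            if any(set(subset) <= set(hit) for hit in found):
                continue
            if count(subset) > 0:
                found.append(subset)
    return found
```

With the toy index, the query "red leather sofa" has no results, and `maximal_subqueries(("red", "leather", "sofa"))` reports the two-term sub-queries that do. Each `count` call is independent of the others at the same subset size, which is what would make such a search easy to issue in parallel.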
Marguerite Leenhardt
Université Paris 3
Sorbonne nouvelle, CLA2T/SYLED, France
A Study of Evaluative Language in SMS Messages: Towards a Characterization
of Opinion
At the moment, the results of tools for analysing information
exchange have significant commercial value. This study
is a textual and linguistic evaluation of a corpus of text messages
sent by mobile phone. The approach aims to bring out distributional
characteristics at different levels of linguistic description,
with the aim of modeling the linguistic content of the knowledge
contained in the corpus. The goal is to contribute to the characterization
of evaluative language in SMS messages. Looking ahead, we try
to identify some markers for industrial applications of the analysis
of such textual content, especially in relation to marketing
applications.
We support the idea that the subjective knowledge
gained from a large body of messages can be used for automated analysis
of the views contained in brief texts published on the web, such
as messages posted on Twitter.
Tuesday April 28
Day Two Opening Keynote
David A. Evans
JustSystems Evans Research, Pennsylvania
E-Discovery: A Signature Challenge for
Search
Corporations
increasingly use and retain information only in the form of electronically
held data and documents. As a result, the production and sharing of
information in legal proceedings throughout the U.S. will depend heavily
on techniques for accessing, searching, organizing and analyzing
electronic data -- the principal focus of E-Discovery. Large corporations
may have terabytes of e-mail and other files spanning many years that are
potentially relevant to a case. In response to a court order, an
E-Discovery team must identify, assemble, individuate and categorize an
organization's files, segregate all "privileged" material (which
may be withheld legally), and deliver a minimally comprehensive and
exhaustive set of data to the opposing party -- all in a relatively short
amount of time. The techniques needed to accomplish such a task
necessarily include search, clustering, classification, filtering, social
network analysis, extraction, and more -- and no one of these is
sufficient. Such requirements challenge our traditional models for search.
In particular, the appropriate user models do not reflect the standard
"web" or "enterprise" conditions. This presentation
explicates the requirements and types of solutions that dominate E-D.
David Milward
Linguamatics, Cambridge, UK
Accessible Knowledge Discovery Using Agile Natural Language-Based
Text Mining
This presentation reviews the challenges faced by the pharmaceutical
industry and other knowledge-intensive industries in answering
business-critical questions using diverse text resources. It
discusses a selection of case studies where an NLP-based approach
for discovering relevant facts and relationships from unstructured
text is delivering significant value - both in terms of improved
productivity and in discovering new knowledge by combining information
extracted from different sources.
Brian J Buck
RiverGlass, Illinois
The Science of Search: How the Enterprise Intelligence Cycle
(EIC) is Critical to Success
For the knowledge professional, there is more to search
than simply looking up a specific piece of information; it is
about information discovery, relevance assessment, analytical
summary and reporting in support of the entire intelligence-gathering
process.
The Enterprise Intelligence Cycle (EIC) is one of
the most important information processes within an organization.
It encompasses everything from the identification of critical
information needs, the entire process of information seeking
and collection, the incorporation of new information into organizational
and individual knowledge models, and the application of human
judgment to create the essential intelligence for timely business
decision making. Recognizing and optimizing the EIC will be a
prerequisite for success in a world increasingly overloaded with
the volume, variety, and velocity of unstructured data,
as will the adoption of search techniques that truly take into
account user context and purpose.
Miles Kehoe and Mark
Bennett
New Idea Engineering, California
Search Security Issues for the Enterprise
Enterprise search must take into account access control and
privacy issues. In particular, sensitive documents need to be
searchable so that they can be shared with the appropriate audience,
but not visible to everyone behind the firewall. For example,
nobody wants 401K account summaries to display except after appropriate
access has been granted, and corporate strategy documents relating
to outsourcing or layoff plans should not be viewable by all
until they are announced. Security must be handled at the document,
sub-document and sub-field levels. Here are our best practices
for these leading search engines. And here are actual "gotchas"
that we have seen at consumer sites and that you can learn from.
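Document-level access control of the kind described above can be sketched in a few lines of Python. This is only a minimal illustration under stated assumptions: the `Document` class, the group names, the sample documents, and the substring matching are all hypothetical, and it covers neither the sub-document and sub-field levels nor any particular search engine's security model.

```python
from dataclasses import dataclass, field


@dataclass
class Document:
    doc_id: str
    text: str
    # Groups allowed to see this document (hypothetical access control list).
    allowed_groups: set = field(default_factory=set)


# Hypothetical sample collection.
DOCS = [
    Document("401k-summary", "401K account summary for one employee", {"hr"}),
    Document("handbook", "401K plan overview for all staff", {"hr", "staff"}),
]


def search(docs, query, user_groups):
    """Return ids of matching documents that the user is allowed to see.

    The access check is applied to every hit before it is returned, so a
    restricted document never appears in the result list of a user who
    lacks access to it.
    """
    needle = query.lower()
    return [
        doc.doc_id
        for doc in docs
        if needle in doc.text.lower() and doc.allowed_groups & set(user_groups)
    ]
```

A user in the `staff` group who searches for "401k" sees only the handbook, while a user in `hr` sees both documents.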
Sid Probstein
Attivio, Massachusetts
Intelligent Integration: Combining Search and BI Capabilities
for Unified Information Access
Enterprise search technologies are efficient in filtering
unstructured content such as emails and documents. While corporate
reports and dashboards display transactional database information,
there is a disconnect between technologies. How do you integrate
these two sources of data to make them more useful? What
about important content that exists outside your organization? There
is a new generation of innovative technologies that enable the
integration of unstructured content with structured data, bringing
together enterprise search with business intelligence capabilities.
By enabling automatic updates and alerts in real time, these
technologies can affect business processes when it matters: at
the convergence of business decisions and actions. For example,
drug companies could be alerted whenever a product is mentioned
in connection with any terms implying adverse effects.
Consumer goods companies could search blogs for comments on their
products by tying their structured product catalog with the ability
to analyze unstructured content and apply sentiment analysis.
This presentation provides real-world examples and explores
new tools that combine enterprise search with business intelligence
capabilities that provide faster time to value by:
- Enabling decisions based on an intuitive complete view of
your information landscape: both structured and unstructured
content including databases, web pages, office documents, email,
and media files.
- Providing a single repository: eliminating the need for jumping
from application to application based on the type of question,
or format of the information you examine.
- Enabling users to use a simple search interface to access
all your information assets, rather than learning complex BI
applications to access structured data.
- Offering comprehensive connectivity and language support,
easy installation, and linear scalability.
- Integrating data with key business processes in real-time
to affect enterprise-wide processes.
David Seuss
Northern Light, Massachusetts
Using Text Analytics for the Automated Analysis and Discovery
of Meaning From Large Stores of Market Intelligence
There has been much recent coverage at places like the Search
Engine Meeting on using text analytics for reputation management
in brand metric tracking applications, but how do you create
systems that assist in business analysis, strategic research
and competitive intelligence from volumes of news and market
research reports? For example, rather than merely tracking
mentions of your brands and measuring the sentiment toward them,
you could find out which technologies your competitors are working
on, uncover where your competitors are using pricing to gain
market share, and identify what product marketing tactics are
being employed in your target markets. Text analytics can greatly
assist in this process, but using text analytics for strategic
research is different from using it for reputation management,
and requires completely different solutions. This presentation
describes the opportunities and challenges in creating systems
for the automated analysis and discovery of strategic meaning
from market intelligence content. It describes what it
takes to create such systems and outlines the pitfalls
to avoid in developing and deploying them.
Jeff Catlin
Lexalytics, Massachusetts
Taking Search to the Next Stage with the Power of Text Analytics
Enterprise search is still growing, evolving and improving,
and its main purpose continues to be to help users find the answers
to their business questions hidden in a complex myriad of sources.
But many people are beginning to ask "what's next?"
for the enterprise search industry. The answer: combining
search technologies with the fast-evolving area of entity extraction
and sentiment analysis. Extracting important metadata and
providing insight into the sentiment of those data complements
enterprise search by helping the user uncover the questions they
may not think to ask. In fact, those familiar with text analytics
would argue that enterprise search is more important than ever
to maintain a competitive edge, and that text analytics will
play an increasingly large part in that equation.
Christian Reuschling and Andreas
Dengel
German Research Center for Artificial Intelligence, Germany
DynaQ - Dynamic Queries for Document-Based, Personal Information
Spaces
The paradigms of common, keyword-based document search engines
are often not sufficient for the natural searching attitude of
human beings. In most systems, the only possibility for searching
is to formulate a query from scratch and obtain the results.
If we have not found what we were looking for, we usually
have to start again, reformulating our query. While it is hard
for humans to explain an entity completely, it is easier for
us to 'navigate' through the document space step-wise, to have
an overview of the current state of the search, and, having several
tools at hand to support us, to refine the initial query.
DFKI has developed an inquiry system called 'DynaQ'. Its
aim is to enable searchers to explore their personal information
space, supporting them with this step-wise searching paradigm
called 'orienteering'. For that, the system offers several tools
in order to fulfill the Visual Information-Seeking Mantra "overview,
zoom & filter, details-on-demand". Some key features
of DynaQ will be demonstrated:
- Bird's-eye view of the result list
- Dynamic query sliders allowing search terms to be weighted,
thus dynamically re-sorting the results on-the-fly
- Thumbnail generation for indexed documents (pdf, office,
etc)
- Relevance feedback: Queries can be contextualized by marking
one or more documents as relevant. Documents that are similar
to them will be ranked higher in the result list. Users can choose
between two kinds of similarity:
  - Textual content similarity
  - Image similarity (for text represented in bitmaps or for
  image files)
- Push search dialogue showing details and related documents
according to the attribute similarity (e.g., same author, similar
full text)
- Indexing of all common file formats (e.g., pdf, MS
Office, rtf, gif, jpeg) and emails
- Availability of the complete Wikipedia index.
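The dynamic query sliders listed above can be pictured with a small Python sketch in which each search term carries a user-adjustable weight and the result list is re-sorted whenever a weight changes. This is not DynaQ's actual scoring: the weighted term-frequency score, the function name, and the sample documents are assumptions made for illustration.

```python
# Hypothetical document collection: id -> full text.
DOCS = {
    "a": "search engine search results",
    "b": "visual overview of search results overview",
}


def rank(docs, weights):
    """Sort document ids by weighted term frequency, highest score first.

    `weights` maps each search term to its slider position. Changing a
    weight and calling rank() again re-sorts the same result set.
    """
    def score(text):
        words = text.lower().split()
        return sum(weight * words.count(term) for term, weight in weights.items())

    return sorted(docs, key=lambda doc_id: score(docs[doc_id]), reverse=True)
```

With the weight on "search" high, `rank(DOCS, {"search": 1.0, "overview": 0.2})` puts document `a` first. Moving the sliders the other way, `rank(DOCS, {"search": 0.2, "overview": 1.0})`, brings document `b` to the top without issuing a new query.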
Daniel Tunkelang
Endeca, Massachusetts
Enabling the Information Seeking Process
In the early days of information science, the process of finding
information was conceived as precisely that: a process. But the
success of commercial search engines had the unfortunate side
effect of reducing this process to a guessing game of relevance.
Given that search engines are not mind readers and cannot reliably
infer a person's intent from a two-word query, we need to remember
that information seeking calls for a process, a dialogue between
the user and the system.
This presentation outlines the principles of information
seeking as a dialogue and walks through concrete examples that
illustrate the principles of human-computer information retrieval (HCIR), a vision that is reshaping approaches to information
access. Specifically, the presentation shows how designing an
application in terms of bi-directional communication between
the user and the system addresses the inherent limitations of
conventional search engine approaches.
Peter Noerr
MuseGlobal, California
The Underground Information Ecosystem: Connectors
How these vital, but fragile, items allow different systems
to connect and interact, and how they need to be maintained and
supported for the information to flow. Like plumbing, they stay
out of sight, but are critical for system integration and services,
such as federated searching, content harvesting, semantic mapping,
and any activity which requires information from more than one source.
Presentation of the Everett Brenner Award for the Best Paper
at the 2009 Search Engine Meeting
Meeting Wrap-up Panel: What we Liked. What we Learned
Two expert industry commentators reflect on what was
said during the two days of the 2009 Search Engine Meeting and,
with the help of the audience, draw some lessons and conclusions.
Conference Ends at approximately 4.30
pm