Searching Drupal
Damien McKenna
Mc-Kenna.com & Bonnier Corporation
Twitter: DamienMcKenna
Searching Drupal
Taken for granted
Assumption that "it'll just work"
But first.. a story
Hired by Bonnier for 3 month Drupal 5 project
Short development cycle
Made assumptions
Made compromises
A story, continued
One major assumption...
"Search will work well enough"
"Tweak later"
Put another way...
"Search will work"
A story, continued
Launched site
Seemed OK, could find results
Complaints of search missing content
57,000 nodes - articles, images, etc
7,000 nodes indexed
A story, continued
Dug around, asked around
Mike Anello found it..
A story, continued
D5 search engine indexing flawed
Indexing tracks last timestamp, last nid, last comment timestamp...
Kludgy
If data converted, strong chance of missing some
Out of 57,000 nodes, only about 5,500 were indexed!
A story, continued
Dug further
Solution...
Use Drupal 6's engine!
Track each node individually
Recommended for all D5 sites!
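The difference between the two indexing strategies can be illustrated with a toy model. This is a sketch, not Drupal's actual code: the `Node` class, the function names, and the sample data are all hypothetical, and it tracks a single timestamp watermark where Drupal 5 tracked several values. It shows how a watermark can skip converted content whose timestamps are older than the watermark, while tracking each node individually cannot.

```python
from dataclasses import dataclass


@dataclass
class Node:
    nid: int
    changed: int  # last-modified timestamp


def index_by_watermark(nodes, indexed, last_changed):
    """D5-style sketch: index only nodes changed after the watermark."""
    for node in sorted(nodes, key=lambda n: n.changed):
        if node.changed > last_changed:
            indexed.add(node.nid)
            last_changed = node.changed
    return last_changed


def index_by_node(nodes, indexed):
    """D6-style sketch: index every node not yet recorded as indexed."""
    for node in nodes:
        if node.nid not in indexed:
            indexed.add(node.nid)


# Hypothetical data: two native nodes, then imported nodes with old timestamps.
native = [Node(1, 1000), Node(2, 2000)]
imported = [Node(3, 500), Node(4, 1500)]

watermark_index = set()
mark = index_by_watermark(native, watermark_index, 0)
index_by_watermark(native + imported, watermark_index, mark)

per_node_index = set()
index_by_node(native, per_node_index)
index_by_node(native + imported, per_node_index)

print(sorted(watermark_index))  # imported nodes 3 and 4 are missing
print(sorted(per_node_index))   # all four nodes are present
```

In this model the imported nodes never pass the watermark check, which mirrors the "if data converted, strong chance of missing some" problem.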
Two parts of search
Internal
Search when already on the site
External
Search from outside
Google, etc
Good amount of overlap
Internal Search
Logical content hierarchy
Each element of an item given a different weight
Title field most important
Then body structure: h1, h2, h3, etc.
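As a rough illustration of weighting by element, here is a minimal scoring sketch. The weights, the `WeightedScorer` class, and the `score` function are assumptions made up for this example; the values Drupal's search actually uses may differ.

```python
from html.parser import HTMLParser

# Illustrative weights only; not taken from Drupal.
TAG_WEIGHTS = {"h1": 25, "h2": 18, "h3": 15}
TITLE_WEIGHT = 30
DEFAULT_WEIGHT = 1


class WeightedScorer(HTMLParser):
    """Counts occurrences of one term, weighted by the enclosing tags."""

    def __init__(self, term):
        super().__init__()
        self.term = term.lower()
        self.stack = []  # currently open tags
        self.score = 0

    def handle_starttag(self, tag, attrs):
        self.stack.append(tag)

    def handle_endtag(self, tag):
        if tag in self.stack:
            # Pop back to (and including) the matching open tag.
            while self.stack and self.stack.pop() != tag:
                pass

    def handle_data(self, data):
        weights = [TAG_WEIGHTS.get(t, DEFAULT_WEIGHT) for t in self.stack]
        weight = max(weights, default=DEFAULT_WEIGHT)
        self.score += weight * data.lower().split().count(self.term)


def score(title, body_html, term):
    """Title hits count most, then headings, then plain body text."""
    scorer = WeightedScorer(term)
    scorer.feed(body_html)
    scorer.close()
    title_hits = title.lower().split().count(term.lower())
    return TITLE_WEIGHT * title_hits + scorer.score


body = "<h2>Ski review</h2><p>This ski is fast</p>"
with_title = score("K2 Apache Recon ski", body, "ski")
without_title = score("K2 Apache Recon", body, "ski")
```

With these made-up weights, the same body scores far higher when the term also appears in the title.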
Internal Search - Tip
Put keywords in the Title field
SkiNet's Gear Finder
Title field has ski model name
Word "ski" nowhere to be found
Search for "k2 skis" - no results
Should be: "[make] [model] ski"
e.g. "K2 Apache Recon ski"
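A small sketch of the "[make] [model] ski" pattern, with a naive exact-word matcher to show why the original titles returned nothing. The function names `gear_title` and `matches_all` are hypothetical, and the matcher is a stand-in for illustration, not how Drupal matches queries.

```python
def gear_title(make, model, category):
    """Build a searchable title such as 'K2 Apache Recon ski'."""
    parts = (make, model, category)
    return " ".join(part.strip() for part in parts if part.strip())


def matches_all(title, query):
    """True if every query word is a word in the title (no stemming)."""
    words = set(title.lower().split())
    return all(word in words for word in query.lower().split())


old_title = "Apache Recon"  # model name only
new_title = gear_title("K2", "Apache Recon", "ski")

print(matches_all(old_title, "k2 ski"))
print(matches_all(new_title, "k2 ski"))
```

Note that the plural "skis" would still not match in this naive matcher; that is the accuracy problem that stemming addresses.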
Internal Search - Accuracy
"ski" vs "skis" vs "skiing"
Porter Stemmer module
Breaks search terms down to root form
e.g. "skis" becomes "ski"
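To show the idea of reducing words to a root form, here is a deliberately crude suffix stripper. It is not the Porter algorithm, which applies a much larger set of ordered rules; `toy_stem` and its suffix list are assumptions for illustration only.

```python
def toy_stem(word):
    """Strip a few common English suffixes. Far cruder than Porter."""
    word = word.lower()
    for suffix in ("ing", "es", "s"):
        # Keep at least three letters so short words are left alone.
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


stems = {word: toy_stem(word) for word in ("ski", "skis", "skiing")}
print(stems)
```

Stemming both the indexed text and the search terms lets "ski", "skis", and "skiing" all land on the same root.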
Internal Search - Configuration
Standard search configuration
Taxonomy weighting
Search Config module
Limit indexing:
Content type
Taxonomy
Works pretty well
Internal Search - Issues
Limited control on search
All words handled the same
Can't limit based on specific fields
Internal Search - Step Forward
Faceted Classification
Each content type field selectable
e.g. product color, book publication date, etc
Becoming the de facto standard...
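The core of faceted classification can be sketched as counting field values and then narrowing by a selected value. The catalog, field names, and helper functions below are hypothetical and exist only to illustrate the concept.

```python
from collections import Counter

# Hypothetical catalog; field names and values are made up for illustration.
PRODUCTS = [
    {"name": "Ski A", "color": "red", "type": "ski"},
    {"name": "Ski B", "color": "blue", "type": "ski"},
    {"name": "Boot C", "color": "red", "type": "boot"},
]


def facet_counts(items, field):
    """Count how many items carry each value of the given field."""
    return Counter(item[field] for item in items if field in item)


def filter_by(items, **selected):
    """Keep only items whose fields equal every selected facet value."""
    return [
        item
        for item in items
        if all(item.get(key) == value for key, value in selected.items())
    ]


color_facet = facet_counts(PRODUCTS, "color")
red_items = filter_by(PRODUCTS, color="red")
type_facet_within_red = facet_counts(red_items, "type")
```

Each selection narrows the result set, and the remaining facets are recounted against the narrowed set.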
Internal Search - Target.com
Internal Search - Solution 1
Faceted Search module
Supports CCK fields, Date, Taxonomy, and more
Best solution for small sites
Internal Search - Larger Problems
Faceted Search module very database intensive
Very slow
Solution:
Move search to an external system
Lots of options...
Google CSE
Sphinx
Internal Search - Solution 2
LuceneAPI module
Engine written in PHP
Requires Zend Framework
Advanced syntax, facets.. lots of good stuff
Good solution for medium-sized sites
May not be suitable for shared hosts, but perfect for going beyond the core search engine without getting involved with Java engines.
Internal Search - Solution 3
Apache Solr module
Dries uses it!
Acquia uses it!
Drupal.org uses it!
Bonnier uses it! :-)
Internal Search - ApacheSolr
Lucene in Java
Move search to another server
Share the same infrastructure with other sites
Keep Java developers employed ;-)
Facets, sorting, related content block, multi-site (soon)..
Best solution for large sites
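Because Solr runs as a separate server, a site talks to it over HTTP. The sketch below only builds a query URL using standard Solr request parameters (`q`, `fq`, `facet`, `facet.field`, `sort`, `rows`, `wt`); it does not show how the Apache Solr module does it. The host name, the `type` field, and the `solr_select_url` helper are assumptions.

```python
from urllib.parse import urlencode


def solr_select_url(base_url, query, facet_fields=(), filters=(),
                    sort=None, rows=10):
    """Build a URL for Solr's select handler with facets and filters."""
    params = [("q", query), ("rows", rows), ("wt", "json")]
    if facet_fields:
        params.append(("facet", "true"))
        params.extend(("facet.field", field) for field in facet_fields)
    params.extend(("fq", flt) for flt in filters)
    if sort:
        params.append(("sort", sort))
    return f"{base_url.rstrip('/')}/select?{urlencode(params)}"


# Hypothetical server and field names.
url = solr_select_url(
    "http://solr.example.com:8983/solr",
    "k2 ski",
    facet_fields=["type"],
    filters=["type:product"],
    sort="score desc",
)
print(url)
```

Since the search server is addressed by URL, several sites can point at the same Solr infrastructure.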
External Search
Google, Yahoo, etc
SEO is king
Standard practices
SEO Checklist
External Search - Yay Drupal
Friendly URLs by default
PathAuto module - automates URL generation
MetaTags/Nodewords module - keywords, description
XMLSiteMap module - notifies search engines
SiteMap module - automated site map
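To make the URL and sitemap ideas concrete, here is a sketch that turns titles into friendly paths and emits a sitemap in the sitemaps.org XML format. It does not reproduce how PathAuto or XMLSiteMap work internally; the `slugify` rule, the `gear` section, the domain, and the date are all assumptions.

```python
import re
from xml.etree import ElementTree as ET

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def slugify(title):
    """Lowercase the title and replace runs of other characters with '-'."""
    return re.sub(r"[^a-z0-9]+", "-", title.lower()).strip("-")


def build_sitemap(base_url, pages):
    """pages: iterable of (section, title, lastmod) tuples."""
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for section, title, lastmod in pages:
        url = ET.SubElement(urlset, "url")
        loc = f"{base_url}/{section}/{slugify(title)}"
        ET.SubElement(url, "loc").text = loc
        ET.SubElement(url, "lastmod").text = lastmod
    return ET.tostring(urlset, encoding="unicode")


# Hypothetical page data.
sitemap_xml = build_sitemap(
    "http://www.example.com",
    [("gear", "K2 Apache Recon ski", "2009-01-15")],
)
print(sitemap_xml)
```

A path built from the title keeps the keywords in the URL, and the sitemap gives search engines a list of those URLs to crawl.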
External Search - Analytics
Google Analytics module
Omniture module
Quantcast
Summary
Drupal 6 engine
Title is king
Starting point: LuceneAPI module
Smaller? Faceted Search module
Larger? Apache Solr module
SEO Checklist
Drupal - yay!
TTFN