Solr

There is a lot to learn about Solr, and you’ll be joining me on that journey as you peruse this page. To start, I’m using the Solr Reference Guide 8.5 to create a survey of the topics I intend to learn.

This is essentially a copy of the reference guide’s table of contents at the moment, though I’ve made a few tweaks here and there for my own purposes.

Getting Started

  • Overview
  • Installation

Deployment and Operations

  • Control Script
  • Config Files
  • Backups
  • Running on HDFS
  • Running on AWS EC2
  • Upgrading a Cluster

Administration User Interface

  • Overview
  • Logging
  • Cloud Screens
  • Collections / Core Admin
  • Java Properties
  • Thread Dump
  • Suggestions Screen
  • Collection-Specific Tools
    • Analysis
    • Dataimport
    • Documents
    • Files
    • Query
    • Stream
    • Schema Browser
  • Core-Specific Tools
    • Ping
    • Plugins & Stats
    • Replication
    • Segments

Documents, Fields, and Schema Design

  • Overview
  • Field Types
  • Defining Fields
  • Copying Fields
  • Dynamic Fields
  • Other Schema Elements
  • Schema API
  • DocValues
  • Schemaless Mode

Analyzers, Tokenizers, and Filters

  • Overview
  • Analyzers
  • Tokenizers
  • Filters
    • Filter Descriptions
    • CharFilterFactories
  • Language Analysis
  • Phonetic Matching
  • Running Your Analyzer

Indexing and Basic Data Operations

  • Overview
  • Post Tool
  • Uploading Data with Index Handlers
  • Indexing Nested Child Documents
  • Uploading Data with Solr Cell using Apache Tika
  • Uploading Structured Data Store Data with the Data Import Handler
  • Updating Parts of Documents
  • Detecting Languages During Indexing
  • De-Duplication
  • Content Streams
  • Reindexing

Searching

  • Overview
  • Velocity Search UI
  • Relevance
  • Query Syntax and Parsing
    • Common Query Parameters
    • Standard Query Parser
    • DisMax Query Parser
    • Extended DisMax (eDisMax) Query Parser
    • Function Queries
    • Local Parameters in Queries
    • Other Parsers
  • JSON Request API
  • JSON Facet API
  • Faceting
    • BlockJoin Faceting
  • Highlighting
  • Spell Checking
  • Query Re-Ranking
  • Transforming Result Documents
  • Searching Nested Child Documents
  • Suggester
  • MoreLikeThis
  • Pagination of Results
  • Collapse and Expand Results
  • Result Grouping
  • Result Clustering
  • Spatial Search
  • The Terms Component
  • The Term Vector Component
  • The Stats Component
  • The Query Elevation Component
  • The Tagger Handler
  • Response Writers
    • Velocity Response Writer
  • Near Real Time Searching
  • RealTime Get
  • Exporting Result Sets
  • Parallel SQL Interface
    • Solr JDBC
      • DbVisualizer
      • SQuirreL SQL
      • Apache Zeppelin
      • Python/Jython
      • R
  • Analytics Component
    • Expression Sources
    • Mapping Functions
    • Reduction Functions

Streaming Expressions

  • Stream Source Reference
  • Stream Decorator Reference
  • Stream Evaluator Reference
  • Math Expressions
    • Scalar
    • Vector
    • Variables
    • Matrices and Matrix Math
    • Streams and Vectorization
    • Text Analysis and Term Vectors
    • Statistics
    • Probability Distributions
    • Monte Carlo Simulations
    • Time Series
    • Linear Regression
    • Interpolation, Derivatives and Integrals
    • Curve Fitting
    • Digital Signal Processing
    • Machine Learning
    • Computational Geometry
  • Graph Traversal

SolrCloud

  • How It Works
    • Shards and Indexing Data
    • Distributed Requests
    • Aliases
  • Resilience
    • Recoveries and Write Tolerance
    • Query Routing and Read Tolerance
  • Configuration and Parameters
    • Setting Up an External ZooKeeper Ensemble
    • Using ZooKeeper to Manage Config Files
    • Collections API
      • Cluster and Node Management
      • Collection Management
      • Collection Aliasing
      • Shard Management
      • Replica Management
    • Parameter Reference
    • Command Line Utilities
    • Legacy Config Files
    • Configsets API
  • Rule-Based Replica Placement
  • Cross Data Center Replication (CDCR)
    • Architecture
    • Configuration
    • Operations
    • API
  • Autoscaling
    • Policy and Preferences
    • Triggers
    • Trigger Actions
    • Listeners
    • Automatically Adding Replicas
    • Fault Tolerance
    • API
    • Migrating Rule-Based Replica Rules to Autoscaling Policies
  • Colocating Collections

Other

  • Legacy Scaling and Distribution
  • Solr Plugins
    • Lib Directories and Directives
    • Package Management
    • Adding Custom Plugins in SolrCloud Mode

Configuration

  • Configuring solrconfig.xml
    • DataDir and DirectoryFactory
    • Schema Factory Definition
    • IndexConfig
    • RequestHandlers and SearchComponents
    • InitParams
    • UpdateHandlers
    • Query Settings
    • RequestDispatcher
    • Update Request Processors
    • Codec Factory
  • Solr Cores and solr.xml
  • Resource Loading
  • Configuration APIs
    • Blob Store
    • Config
    • Request Parameters
    • Managed Resources
  • Implicit RequestHandlers
  • JVM Settings
  • v2 API

Monitoring

  • Metrics Reporting
  • Metrics History
  • MBean Request Handler
  • Configuring Logging
  • Using JMX
  • Using Prometheus and Grafana
  • Performance Statistics Reference
  • Distributed Tracing

Security

  • Configuring Authentication, Authorization and Audit Logging
  • Enabling SSL
  • Audit Logging
  • ZooKeeper Access Control

Client APIs

  • Overview
  • Choosing Output Format
  • SolrJ
  • JavaScript
  • Python
  • Ruby
  • Other Clients

Solr Glossary

Tools

  • Blacklight – An open source, Ruby on Rails front-end for Solr.