Commit fc7cc571 authored by ghaughian

[solr] adding support for Apache Solr

updating readme

updating package info

perfecting logic for http solr clients for all operations

renamed properties, tested cloud mode and cleaned code

removed dependency on dynamic field names, updated readme

now enforcing checkstyle

adding solr artifact

removing test cases relying on external dependencies

removed unused maven dependencies, added batch mode support, all try blocks now catch explicit exceptions, Query/UpdateResponse status codes are handled more granularly, updated readme, added sample schema.xml file to support default field names in ycsb client, updated all license headers to 2016, using SolrClient object as primary client type regardless of whether Solr is running in Cloud or Stand-alone mode

cleaned code and config files, now accepting a solr base url property, simplified sample schema.xml file, renamed class to SolrClient, now updating documents atomically, added batch support to delete method

updated new line spacing of pom file comments

removed sample schema file, updated readme with more in-depth explanation on running/setting up the solr-binding

removed some code lines no longer in use

renamed zookeeper param name, now throwing caught exceptions where appropriate, debug messages are now being logged on stderr

now returning an appropriate error if we receive an unexpected response from solr server, repeated calls to getResults is no longer

now using singletonMap to store update params in, fixed typo and missing id field in sample config in README
parent 34420257
......@@ -78,6 +78,7 @@ DATABASES = {
"orientdb" : "com.yahoo.ycsb.db.OrientDBClient",
"redis" : "com.yahoo.ycsb.db.RedisClient",
"s3" : "com.yahoo.ycsb.db.S3Client",
"solr" : "com.yahoo.ycsb.db.SolrClient",
"tarantool" : "com.yahoo.ycsb.db.TarantoolClient",
"voldemort" : "com.yahoo.ycsb.db.VoldemortClient"
}
......
......@@ -84,6 +84,9 @@ public class Status {
public static final Status NOT_FOUND = new Status("NOT_FOUND", "The requested record was not found.");
public static final Status NOT_IMPLEMENTED = new Status("NOT_IMPLEMENTED", "The operation is not implemented for the current binding.");
public static final Status UNEXPECTED_STATE = new Status("UNEXPECTED_STATE", "The operation reported success, but the result was not as expected.");
public static final Status BAD_REQUEST = new Status("BAD_REQUEST", "The request was not valid.");
public static final Status FORBIDDEN = new Status("FORBIDDEN", "The operation is forbidden.");
public static final Status SERVICE_UNAVAILABLE = new Status("SERVICE_UNAVAILABLE", "Dependant service for the current binding is not available.");
}
......@@ -139,6 +139,11 @@ LICENSE file.
<artifactId>s3-binding</artifactId>
<version>${project.version}</version>
</dependency>
<dependency>
<groupId>com.yahoo.ycsb</groupId>
<artifactId>solr-binding</artifactId>
<version>${project.version}</version>
</dependency>
<dependency>
<groupId>com.yahoo.ycsb</groupId>
<artifactId>tarantool-binding</artifactId>
......
......@@ -93,6 +93,7 @@ LICENSE file.
<couchbase.version>1.1.8</couchbase.version>
<tarantool.version>1.6.5</tarantool.version>
<aerospike.version>3.1.2</aerospike.version>
<solr.version>5.4.0</solr.version>
</properties>
<modules>
......@@ -124,6 +125,7 @@ LICENSE file.
<module>orientdb</module>
<module>redis</module>
<module>s3</module>
<module>solr</module>
<module>tarantool</module>
<!--<module>voldemort</module>-->
</modules>
......
<!--
Copyright (c) 2016 YCSB contributors. All rights reserved.
Licensed under the Apache License, Version 2.0 (the "License"); you
may not use this file except in compliance with the License. You
may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or
implied. See the License for the specific language governing
permissions and limitations under the License. See accompanying
LICENSE file.
-->
## Quick Start
This section describes how to run YCSB on Solr running locally.
### 1. Set Up YCSB
Clone the YCSB git repository and compile:

    git clone git://github.com/brianfrankcooper/YCSB.git
    cd YCSB
    mvn -pl com.yahoo.ycsb:solr-binding -am clean package

### 2. Set Up Solr
There must be a running Solr instance with a core/collection pre-defined and configured.
- See this [API](https://cwiki.apache.org/confluence/display/solr/CoreAdmin+API#CoreAdminAPI-CREATE) reference on how to create a core.
- See this [API](https://cwiki.apache.org/confluence/display/solr/Collections+API#CollectionsAPI-api1) reference on how to create a collection in SolrCloud mode.
The `conf/schema.xml` file of the core/collection just created must define the field names used during benchmarking.
The following sample from a schema config file matches the default field names used by the YCSB client:

    <field name="id" type="string" indexed="true" stored="true" required="true" multiValued="false"/>
    <field name="field0" type="text_general" indexed="true" stored="true"/>
    <field name="field1" type="text_general" indexed="true" stored="true"/>
    <field name="field2" type="text_general" indexed="true" stored="true"/>
    <field name="field3" type="text_general" indexed="true" stored="true"/>
    <field name="field4" type="text_general" indexed="true" stored="true"/>
    <field name="field5" type="text_general" indexed="true" stored="true"/>
    <field name="field6" type="text_general" indexed="true" stored="true"/>
    <field name="field7" type="text_general" indexed="true" stored="true"/>
    <field name="field8" type="text_general" indexed="true" stored="true"/>
    <field name="field9" type="text_general" indexed="true" stored="true"/>

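If a workload uses a different number of fields (YCSB's `fieldcount` property, which defaults to 10), the schema needs one definition per field. The following is a minimal sketch for generating those lines; `SchemaFields` and its methods are hypothetical helpers, not part of the binding, and the `id` field still has to be declared separately.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper: prints schema.xml field definitions for a given field count.
final class SchemaFields {
  /** Builds one field definition in the same form as the sample schema lines. */
  static String fieldLine(String name) {
    return "<field name=\"" + name
        + "\" type=\"text_general\" indexed=\"true\" stored=\"true\"/>";
  }

  /** Returns definitions for field0 .. field(fieldCount - 1). */
  static List<String> fieldLines(int fieldCount) {
    List<String> lines = new ArrayList<>();
    for (int i = 0; i < fieldCount; i++) {
      lines.add(fieldLine("field" + i));
    }
    return lines;
  }

  public static void main(String[] args) {
    // 10 matches the default field count assumed in the lead-in.
    for (String line : fieldLines(10)) {
      System.out.println(line);
    }
  }
}
```

The output can be pasted into `conf/schema.xml` next to the `id` field definition.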
If running in SolrCloud mode, ensure there is an external ZooKeeper cluster running.
- See [here](https://cwiki.apache.org/confluence/display/solr/Setting+Up+an+External+ZooKeeper+Ensemble) for details on how to set up an external Zookeeper cluster.
- See [here](https://cwiki.apache.org/confluence/display/solr/Using+ZooKeeper+to+Manage+Configuration+Files) for instructions on how to use Zookeeper to manage your core/collection configuration files.
### 3. Run YCSB
Now you are ready to run! First, load the data:

    ./bin/ycsb load solr -s -P workloads/workloada -p table=<core/collection name>

Then, run the workload:

    ./bin/ycsb run solr -s -P workloads/workloada -p table=<core/collection name>

For further configuration see below:
### Default Configuration Parameters
The default settings for the Solr node that is created are as follows:
- `solr.cloud`
  - A Boolean value indicating whether Solr is running in SolrCloud mode. If so, an external ZooKeeper cluster must also be running.
  - Default value is `false`, so Solr is expected to be running in stand-alone mode.
- `solr.base.url`
  - The base URL used to reach a running Solr instance in stand-alone mode.
  - Default value is `http://localhost:8983/solr`
- `solr.commit.within.time`
  - The maximum time in milliseconds to wait for a commit when in batch mode; ignored otherwise.
  - Default value is `1000` (milliseconds)
- `solr.batch.mode`
  - Indicates whether inserts/updates/deletes should be committed in batches (frequency controlled by the `solr.commit.within.time` parameter) or committed one document at a time.
  - Default value is `false`
- `solr.zookeeper.hosts`
  - A comma-separated list of host:port pairs of the ZooKeeper nodes used to manage SolrCloud configurations.
  - Must be passed when in [SolrCloud](https://cwiki.apache.org/confluence/display/solr/SolrCloud) mode.
  - Default value is `localhost:2181`
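
The way these parameters fall back to their defaults can be sketched with plain `java.util.Properties`. The property names and default values below are the ones listed in this section; the `SolrSettings` class itself is a hypothetical illustration and not the binding's actual code.

```java
import java.util.Properties;

// Hypothetical holder showing how each setting resolves to its documented default.
final class SolrSettings {
  final boolean cloudMode;
  final boolean batchMode;
  final int commitWithinMs;
  final String baseUrl;
  final String zookeeperHosts;

  SolrSettings(Properties props) {
    // getProperty(key, default) returns the default when the key is absent.
    cloudMode = Boolean.parseBoolean(props.getProperty("solr.cloud", "false"));
    batchMode = Boolean.parseBoolean(props.getProperty("solr.batch.mode", "false"));
    commitWithinMs = Integer.parseInt(props.getProperty("solr.commit.within.time", "1000"));
    baseUrl = props.getProperty("solr.base.url", "http://localhost:8983/solr");
    zookeeperHosts = props.getProperty("solr.zookeeper.hosts", "localhost:2181");
  }

  public static void main(String[] args) {
    Properties props = new Properties();
    props.setProperty("solr.batch.mode", "true"); // override a single setting
    SolrSettings settings = new SolrSettings(props);
    System.out.println("cloud=" + settings.cloudMode
        + " batch=" + settings.batchMode
        + " commitWithinMs=" + settings.commitWithinMs
        + " baseUrl=" + settings.baseUrl);
  }
}
```

Only `solr.batch.mode` is overridden here; every other setting keeps the default listed in this section.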
### Custom Configuration
If you wish to customize the settings used to create the Solr node,
you can create a new property file that contains your desired Solr
node settings and pass it in via the `-P` parameter of the `bin/ycsb` script. Note that
the default properties are kept unless you explicitly override them.
Assuming that a properties file named `myproperties.data` contains the
custom Solr node configuration, you can execute the following to
pass it to the Solr client:

    ./bin/ycsb run solr -P workloads/workloada -P myproperties.data -s

If you wish to use SolrCloud mode, ensure a Solr cluster is running with an
external ZooKeeper cluster and that an appropriate collection has been created.
Make sure to pass the following properties as parameters to the `bin/ycsb` script:

    solr.cloud=true
    solr.zookeeper.hosts=<zkHost1>:<zkPort1>,...,<zkHostN>:<zkPortN>

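
As a quick sanity check before a run, the two SolrCloud properties can be read back from property-file text with standard Java. This is only a sketch: the `CloudProperties` class and the `zk1.example.com`/`zk2.example.com` host names are made up for illustration, and this is not how YCSB itself loads its property files.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.Arrays;
import java.util.List;
import java.util.Properties;

// Hypothetical helper: parses SolrCloud settings from property-file text.
final class CloudProperties {
  /** Parses text in java.util.Properties format. */
  static Properties load(String fileText) {
    Properties props = new Properties();
    try {
      props.load(new StringReader(fileText));
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
    return props;
  }

  /** Splits the comma-separated host:port list into its entries. */
  static List<String> zookeeperHosts(Properties props) {
    String hosts = props.getProperty("solr.zookeeper.hosts", "localhost:2181");
    return Arrays.asList(hosts.trim().split("\\s*,\\s*"));
  }

  public static void main(String[] args) {
    // Placeholder host names; substitute the real ZooKeeper ensemble.
    String fileText = "solr.cloud=true\n"
        + "solr.zookeeper.hosts=zk1.example.com:2181,zk2.example.com:2181\n";
    Properties props = load(fileText);
    System.out.println("cloud mode: " + props.getProperty("solr.cloud"));
    for (String host : zookeeperHosts(props)) {
      System.out.println("zookeeper node: " + host);
    }
  }
}
```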
<?xml version="1.0" encoding="UTF-8"?>
<!--
Copyright (c) 2012 - 2016 YCSB contributors.
All rights reserved.
Licensed under the Apache License, Version 2.0 (the "License"); you
may not use this file except in compliance with the License. You
may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or
implied. See the License for the specific language governing
permissions and limitations under the License. See accompanying
LICENSE file.
-->
<project xmlns="http://maven.apache.org/POM/4.0.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/maven-v4_0_0.xsd">
<modelVersion>4.0.0</modelVersion>
<parent>
<groupId>com.yahoo.ycsb</groupId>
<artifactId>binding-parent</artifactId>
<version>0.7.0-SNAPSHOT</version>
<relativePath>../binding-parent</relativePath>
</parent>
<artifactId>solr-binding</artifactId>
<name>Solr Binding</name>
<packaging>jar</packaging>
<dependencies>
<dependency>
<groupId>com.yahoo.ycsb</groupId>
<artifactId>core</artifactId>
<version>${project.version}</version>
<scope>provided</scope>
</dependency>
<dependency>
<groupId>org.apache.solr</groupId>
<artifactId>solr-solrj</artifactId>
<version>${solr.version}</version>
</dependency>
</dependencies>
<build>
<plugins>
<plugin>
<groupId>org.apache.maven.plugins</groupId>
<artifactId>maven-checkstyle-plugin</artifactId>
<version>2.15</version>
<configuration>
<consoleOutput>true</consoleOutput>
<configLocation>../checkstyle.xml</configLocation>
<failOnViolation>true</failOnViolation>
<failsOnError>true</failsOnError>
</configuration>
<executions>
<execution>
<id>validate</id>
<phase>validate</phase>
<goals>
<goal>checkstyle</goal>
</goals>
</execution>
</executions>
</plugin>
</plugins>
</build>
</project>
/**
* Copyright (c) 2016 YCSB contributors. All rights reserved.
*
* Licensed under the Apache License, Version 2.0 (the "License"); you
* may not use this file except in compliance with the License. You
* may obtain a copy of the License at
*
* http://www.apache.org/licenses/LICENSE-2.0
*
* Unless required by applicable law or agreed to in writing, software
* distributed under the License is distributed on an "AS IS" BASIS,
* WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or
* implied. See the License for the specific language governing
* permissions and limitations under the License. See accompanying
* LICENSE file.
*/
package com.yahoo.ycsb.db;
import com.yahoo.ycsb.ByteIterator;
import com.yahoo.ycsb.DB;
import com.yahoo.ycsb.DBException;
import com.yahoo.ycsb.Status;
import com.yahoo.ycsb.StringByteIterator;
import org.apache.solr.client.solrj.SolrQuery;
import org.apache.solr.client.solrj.impl.CloudSolrClient;
import org.apache.solr.client.solrj.impl.HttpSolrClient;
import org.apache.solr.client.solrj.response.QueryResponse;
import org.apache.solr.client.solrj.response.UpdateResponse;
import org.apache.solr.client.solrj.SolrServerException;
import org.apache.solr.common.SolrDocument;
import org.apache.solr.common.SolrDocumentList;
import org.apache.solr.common.SolrInputDocument;
import java.io.IOException;
import java.util.Collections;
import java.util.HashMap;
import java.util.Map.Entry;
import java.util.Properties;
import java.util.Set;
import java.util.Vector;
/**
* Solr client for YCSB framework.
*
* <p>
* Default properties to set:
* </p>
* <ul>
* See README.md
* </ul>
*
*/
public class SolrClient extends DB {
public static final String DEFAULT_CLOUD_MODE = "false";
public static final String DEFAULT_BATCH_MODE = "false";
public static final String DEFAULT_ZOOKEEPER_HOSTS = "localhost:2181";
public static final String DEFAULT_SOLR_BASE_URL = "http://localhost:8983/solr";
public static final String DEFAULT_COMMIT_WITHIN_TIME = "1000";
private org.apache.solr.client.solrj.SolrClient client;
private Integer commitTime;
private Boolean batchMode;
/**
* Initialize any state for this DB. Called once per DB instance; there is one DB instance per
* client thread.
*/
@Override
public void init() throws DBException {
Properties props = getProperties();
commitTime = Integer
.parseInt(props.getProperty("solr.commit.within.time", DEFAULT_COMMIT_WITHIN_TIME));
batchMode = Boolean.parseBoolean(props.getProperty("solr.batch.mode", DEFAULT_BATCH_MODE));
// Check if Solr cluster is running in SolrCloud or Stand-alone mode
Boolean cloudMode = Boolean.parseBoolean(props.getProperty("solr.cloud", DEFAULT_CLOUD_MODE));
System.err.println("Solr Cloud Mode = " + cloudMode);
if (cloudMode) {
System.err.println("Solr Zookeeper Remote Hosts = "
+ props.getProperty("solr.zookeeper.hosts", DEFAULT_ZOOKEEPER_HOSTS));
client = new CloudSolrClient(
props.getProperty("solr.zookeeper.hosts", DEFAULT_ZOOKEEPER_HOSTS));
} else {
client = new HttpSolrClient(props.getProperty("solr.base.url", DEFAULT_SOLR_BASE_URL));
}
}
@Override
public void cleanup() throws DBException {
try {
client.close();
} catch (IOException e) {
throw new DBException(e);
}
}
/**
* Insert a record in the database. Any field/value pairs in the specified values HashMap will be
* written into the record with the specified record key.
*
* @param table
* The name of the table
* @param key
* The record key of the record to insert.
* @param values
* A HashMap of field/value pairs to insert in the record
* @return {@link Status#OK} on success, otherwise a {@link Status} describing the failure.
*/
@Override
public Status insert(String table, String key, HashMap<String, ByteIterator> values) {
try {
SolrInputDocument doc = new SolrInputDocument();
doc.addField("id", key);
for (Entry<String, String> entry : StringByteIterator.getStringMap(values).entrySet()) {
doc.addField(entry.getKey(), entry.getValue());
}
UpdateResponse response;
if (batchMode) {
response = client.add(table, doc, commitTime);
} else {
response = client.add(table, doc);
client.commit(table);
}
return checkStatus(response.getStatus());
} catch (IOException | SolrServerException e) {
e.printStackTrace();
}
return Status.ERROR;
}
/**
* Delete a record from the database.
*
* @param table
* The name of the table
* @param key
* The record key of the record to delete.
* @return {@link Status#OK} on success, otherwise a {@link Status} describing the failure.
*/
@Override
public Status delete(String table, String key) {
try {
UpdateResponse response;
if (batchMode) {
response = client.deleteById(table, key, commitTime);
} else {
response = client.deleteById(table, key);
client.commit(table);
}
return checkStatus(response.getStatus());
} catch (IOException | SolrServerException e) {
e.printStackTrace();
}
return Status.ERROR;
}
/**
* Read a record from the database. Each field/value pair from the result will be stored in a
* HashMap.
*
* @param table
* The name of the table
* @param key
* The record key of the record to read.
* @param fields
* The list of fields to read, or null for all of them
* @param result
* A HashMap of field/value pairs for the result
* @return {@link Status#OK} on success, otherwise a {@link Status} describing the failure.
*/
@Override
public Status read(String table, String key, Set<String> fields,
HashMap<String, ByteIterator> result) {
try {
Boolean returnFields = false;
String[] fieldList = null;
if (fields != null) {
returnFields = true;
fieldList = fields.toArray(new String[fields.size()]);
}
SolrQuery query = new SolrQuery();
query.setQuery("id:" + key);
if (returnFields) {
query.setFields(fieldList);
}
final QueryResponse response = client.query(table, query);
SolrDocumentList results = response.getResults();
if ((results != null) && (results.getNumFound() > 0)) {
for (String field : results.get(0).getFieldNames()) {
result.put(field,
new StringByteIterator(String.valueOf(results.get(0).getFirstValue(field))));
}
}
return checkStatus(response.getStatus());
} catch (IOException | SolrServerException e) {
e.printStackTrace();
}
return Status.ERROR;
}
/**
* Update a record in the database. Any field/value pairs in the specified values HashMap will be
* written into the record with the specified record key, overwriting any existing values with the
* same field name.
*
* @param table
* The name of the table
* @param key
* The record key of the record to write.
* @param values
* A HashMap of field/value pairs to update in the record
* @return {@link Status#OK} on success, otherwise a {@link Status} describing the failure.
*/
@Override
public Status update(String table, String key, HashMap<String, ByteIterator> values) {
try {
SolrInputDocument updatedDoc = new SolrInputDocument();
updatedDoc.addField("id", key);
for (Entry<String, String> entry : StringByteIterator.getStringMap(values).entrySet()) {
updatedDoc.addField(entry.getKey(), Collections.singletonMap("set", entry.getValue()));
}
UpdateResponse writeResponse;
if (batchMode) {
writeResponse = client.add(table, updatedDoc, commitTime);
} else {
writeResponse = client.add(table, updatedDoc);
client.commit(table);
}
return checkStatus(writeResponse.getStatus());
} catch (IOException | SolrServerException e) {
e.printStackTrace();
}
return Status.ERROR;
}
/**
* Perform a range scan for a set of records in the database. Each field/value pair from the
* result will be stored in a HashMap.
*
* @param table
* The name of the table
* @param startkey
* The record key of the first record to read.
* @param recordcount
* The number of records to read
* @param fields
* The list of fields to read, or null for all of them
* @param result
* A Vector of HashMaps, where each HashMap is a set field/value pairs for one record
* @return {@link Status#OK} on success, otherwise a {@link Status} describing the failure.
*/
@Override
public Status scan(String table, String startkey, int recordcount, Set<String> fields,
Vector<HashMap<String, ByteIterator>> result) {
try {
Boolean returnFields = false;
String[] fieldList = null;
if (fields != null) {
returnFields = true;
fieldList = fields.toArray(new String[fields.size()]);
}
SolrQuery query = new SolrQuery();
query.setQuery("*:*");
query.setParam("fq", "id:[ " + startkey + " TO * ]");
if (returnFields) {
query.setFields(fieldList);
}
query.setRows(recordcount);
final QueryResponse response = client.query(table, query);
SolrDocumentList results = response.getResults();
HashMap<String, ByteIterator> entry;
for (SolrDocument hit : results) {
// Size each record's map by its own field count, not by the total number of hits.
entry = new HashMap<String, ByteIterator>(hit.getFieldNames().size());
for (String field : hit.getFieldNames()) {
entry.put(field, new StringByteIterator(String.valueOf(hit.getFirstValue(field))));
}
result.add(entry);
}
return checkStatus(response.getStatus());
} catch (IOException | SolrServerException e) {
e.printStackTrace();
}
return Status.ERROR;
}
private Status checkStatus(int status) {
Status responseStatus;
switch (status) {
case 0:
responseStatus = Status.OK;
break;
case 400:
responseStatus = Status.BAD_REQUEST;
break;
case 403:
responseStatus = Status.FORBIDDEN;
break;
case 404:
responseStatus = Status.NOT_FOUND;
break;
case 500:
responseStatus = Status.ERROR;
break;
case 503:
responseStatus = Status.SERVICE_UNAVAILABLE;
break;
default:
responseStatus = Status.UNEXPECTED_STATE;
break;
}
return responseStatus;
}
}
/*
* Copyright (c) 2016 YCSB contributors. All rights reserved.
*
* Licensed under the Apache License, Version 2.0 (the "License"); you
* may not use this file except in compliance with the License. You
* may obtain a copy of the License at
*
* http://www.apache.org/licenses/LICENSE-2.0
*
* Unless required by applicable law or agreed to in writing, software
* distributed under the License is distributed on an "AS IS" BASIS,
* WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or
* implied. See the License for the specific language governing
* permissions and limitations under the License. See accompanying
* LICENSE file.
*/
/**
* The YCSB binding for
* <a href="http://lucene.apache.org/solr/">Solr</a>.
*/
package com.yahoo.ycsb.db;