Using Mongodb in .NET

by
Joe Feser
of
Logical Advantage
@joefeser Twitter
joefeser github

What is it?

MongoDB is an open-source NoSQL database used by companies of all sizes, across all industries and for a wide variety of applications. It is an agile database that allows schemas to change quickly as applications evolve, while still providing the functionality developers expect from traditional databases, such as secondary indexes, a full query language and strict consistency.


http://www.mongodb.com/mongodb-overview

Features


  • JSON Data Model with Dynamic Schemas
  • Auto-Sharding for Horizontal Scalability
  • Built-In Replication for High Availability
  • Rich Secondary Indexes, including geospatial and TTL indexes
  • Text Search
  • Aggregation Framework & Native MapReduce
  • Hadoop Integration
  • Drivers: 10 MongoDB-Supported Drivers; 40 Community-Supported Drivers
    http://www.mongodb.com/mongodb-overview

Drivers in many languages

JavaScript
Python
Ruby
PHP
Perl
Java
Scala
C#
C
C++
Haskell
Erlang

Use Cases


MongoDB is a general purpose database suitable for most applications and use cases.

http://www.mongodb.com/use-cases

Use Cases

  • Big Data
  • Content Management and Delivery
  • Customer Data Management
  • Data Hub
  • Database-as-a-Service
  • Internet of Things
  • Mobile Apps
  • Product and Asset Catalogs
  • Security and Fraud Apps
  • Social and Collaboration Apps

http://www.mongodb.com/use-cases

Setting up mongodb

Download the latest build




http://www.mongodb.org/downloads

Unzip the files


I use c:\mongodb\bin

Create a folder for the data


The default is c:\data\db
This is the default folder on the same drive as mongod.exe. You can also pass a different folder as a command line option (dbpath).

Start MongoD




From the c:\mongodb\bin folder

More Information

http://docs.mongodb.org/manual/tutorial/install-mongodb-on-windows/

Terms vs Sql Server

SQL Terms/Concepts MongoDB Terms/Concepts
database database
table collection
row document or BSON document
column field
index index

http://docs.mongodb.org/manual/reference/sql-comparison/

Terms vs SQL Server

SQL Terms/Concepts MongoDB Terms/Concepts
table joins embedded documents and linking
primary key primary key
Specify any unique column or column combination as primary key. In MongoDB, the primary key is automatically set to the _id field.
aggregation (e.g. group by) aggregation pipeline


http://docs.mongodb.org/manual/reference/sql-comparison/

Data types

  • Object id
  • Integer 32 / 64 bit
  • Double
  • String
  • Object
  • Array
  • Binary data
  • Boolean
  • Date
  • Null

BSON Data types

  • Type
  • Double
  • String
  • Object
  • Array
  • Binary data
  • Object id
  • Boolean
  • Date
  • Null
  • Regular Expression
  • JavaScript
  • Symbol
  • JavaScript (with scope)
  • 32-bit integer
  • Timestamp
  • 64-bit integer
  • Min key
  • Max key

http://docs.mongodb.org/manual/reference/bson-types/

How to connect to MongoDB from .Net

  • Configure a connection
  • Create a MongoClient object
  • Create a MongoServer
  • Get access to a MongoDB database
  • Consider using static properties

How to connect to MongoDB from .Net

public static string connectionString = "mongodb://localhost";
public static MongoClient Client = new MongoClient(connectionString);
public static MongoServer Server = Client.GetServer();
public static MongoDatabase MainDatabase = Server.GetDatabase("commerce");

How do you access MongoDB collections

The MongoDatabase object has a method on it called GetCollection<T> allowing you a strongly types object reference.

You are not limited to storing one type of object in a collection, but I do to keep from making mistakes, not having _id collisions, and to keep sane.
public static MongoCollection<MongoProduct> ProductCollection 
    = Mongo.MainDatabase.GetCollection<MongoProduct>("product");

How to set up your classes for serialization

By default, all of your properties will be serialized, as long as they are not read only (only a get)

There are many ways to set up your objects for serialization. This gives you total control over how your objects are serialized and deserialized

You can use POCO objects with no markup using the BsonClassMap

How to set up your classes for serialization

Custom Attributes

[BsonIgnoreIfNull]
[BsonElement("esrb")]
public string ESRBAgeRating {
    get;
    set;
}
 
[BsonElement("format")]
public string Format {
    get;
    set;
}

How to set up your classes for serialization

BsonClassMap
This will be a very familiar pattern for those that are used to AutoMapper

BsonClassMap.RegisterClassMap<MongoProduct>(cm => {
    cm.AutoMap();
    cm.SetIgnoreExtraElements(true);
    cm.UnmapProperty(c => c.BuyItNowPrice);
    cm.GetMemberMap(c => c.Format).SetIgnoreIfNull(true);
    cm.GetMemberMap(c => c.Genre).SetIgnoreIfNull(true);
});

How to set up your classes for serialization

Decimal types are not supported! 

How do I save them so they are not saved as strings?

How to set up your classes for serialization

Your best bet is to convert the value to an integer or long data type.

Pick a common scale for all decimals in a type (or globally) so the user will not have to guess how to convert your value back

How to set up your classes for serialization

Using another property in your class to serialize the object. 
Not as clean but no mapper is needed. 
This is problematic for query operations in c#.
[BsonElement("listPrice")]
public int ListPriceInteger {
    get {
        return (int)(ListPrice * 10000); //4 dec places
    }
    set {
        ListPrice = value / 10000;
    }
}
 
//MSRP
[BsonIgnore]
public decimal ListPrice {
    get;
    set;
}

How to set up your classes for serialization

Create a serializer and use that in your BsonClassMap to write the value
BsonClassMap.RegisterClassMap<Dimension>(cm => {
    cm.AutoMap();
    cm.GetMemberMap(c => c.Height).SetSerializer(new DecimalSerializer())
        .SetElementName("height");
    cm.GetMemberMap(c => c.Length).SetSerializer(new DecimalSerializer())
        .SetElementName("length");
    cm.GetMemberMap(c => c.Weight).SetSerializer(new DecimalSerializer())
        .SetElementName("weight");
    cm.GetMemberMap(c => c.Width).SetSerializer(new DecimalSerializer())
        .SetElementName("width");
});

How to set up your classes for serialization

This is the original Dimension class. It is a POCO. All operations are performed in the BsonClassMap
public class Dimension {
 
    public decimal Length {
        getset;
    }
 
    public decimal Height {
        getset;
    }
 
    public decimal Width {
        getset;
    }
 
    public decimal Weight {
        getset;
    }
}

What does this look like in MongoDB?

"dimensions" : {
    "length" : NumberLong(100400),
    "height" : NumberLong(5000),
    "width" : NumberLong(90100),
    "weight" : NumberLong(14100)
  }

More information

  • http://docs.mongodb.org/ecosystem/tutorial/serialize-documents-with-the-csharp-driver/
  • http://docs.mongodb.org/ecosystem/tutorial/use-csharp-driver/

Retrieving data

Retrieving one record

var query = Query<MongoProduct>
    .Where(item => item.ASIN == asinName);
 
var asin = Collections.ProductCollection.FindOne(query);

Retrieving data

Retrieving all items that are less than .1 lbs

var query = Query<MongoProduct>
    .Where(item => item.PackageDimensions.Weight < .1m);
var itemsVeryLight = Collections.ProductCollection
    .FindAs<MongoProduct>(query).ToList();

Retrieving data

Projecting just the needed values to save processing and bandwidth.
The "Id" property will always be set

//does work
var asin2 = Collections.ProductCollection.Find(query)
    .SetFields(Fields.Include("listPrice""title")).FirstOrDefault();
//does not work
var asin2a = Collections.ProductCollection.Find(query)
    .SetFields(Fields.Include("ListPriceInteger""Title")).FirstOrDefault();
 
//must be the c# class properties.
var asin3 = Collections.ProductCollection.Find(query)
    .SetFields(Fields<MongoProduct>.Include(c => c.ListPriceIntegerc => c.Title)).FirstOrDefault();

retrieving Data

Find
FindOne

Saving data

If you know that without a doubt that the object does not exist with the same id, you can use insert.

var product = new MongoProduct() { };
 
Collections.ProductCollection.Insert<MongoProduct>(product);

Saving data

If you are not sure if the id exists or you are performing an update, use the save method. This method will determine if an insert or update must happen.

If you want it to fail if the object already exists and you want it to be a new record, call insert. Save will override whatever is in the database

var product = new MongoProduct() { };
 
Collections.ProductCollection.Save<MongoProduct>(product);

Bulk Inserts

If you have many objects to insert at once and they are all inserts, you can use the InsertBatch method.

It takes in an IEnumerable<T>.

It is very fast. For the sample data, 25,000 records were inserted in 1600 ms

Collections.ProductCollection.InsertBatch<Product>(list);

Atomic Updates

  • There is no concept of a pessimistic lock
  • Atomic updates are per document
  • Reading, modifying the document and saving is not atomic
  • Transactions do not exist

Atomic Updates

findAndModify is the MongoDB method that performs atomic updates.
You provide the method with:
  • A Query (optional)
  • A Sort operator (optional)
  • If the document should be removed
  • The Update Statement
  • If you want the new document returned
  • The fields to return (optional)
  • If an upsert should be performed
  • The values to push if the item is new (optional)

Atomic Updates

We have a collection that is storing the view counts for each product in our system. When a user views the page, we want to increment a counter and also set the last view time. Optionally we want to add a keyword that was used when the user searched.
var asinName = "B00000J1TX";
var query = Query.And(Query.EQ("_id"asinName));
var sortBy = SortBy.Null;
var update = Update.Inc("counter"1).Set("lastView", DateTime.Now)
    .SetOnInsert("keywords"new BsonArray(new[] { "new" }));
var result = Collections.ProductViewCollection
    .FindAndModify(querysortByupdatetruetrue);
 
var found = Collections.ProductViewCollection.FindOne(query);

Atomic Updates

  • findAndModify will only update one record, no matter how you use it.
  • If multiple documents need to be updated, you must use the Update method.
  • Each document update will be atomic but the update as a whole will not
  • If there is a failure while performing your update, there is no way to know which documents were updated and which ones were not.

Atomic Updates

Update all documents in the collection

var update = Update.Inc("counter"1);
var result = Collections.ProductViewCollection
    .Update(nullupdate); 

Atomic Updates

Updating all documents that contain a keyword and only add a new keyword if it does not already exist.

var array = new List<string>() { "new" };
var query = Query<MongoProductView>.All(item => item.Keywordsarray);
//show that we did return a product.
var found = Collections.ProductViewCollection.Find(query).ToList();
//Add keyword to all documents that have a keyword of new
var update = Update.AddToSet("keywords""test");
//you could also use push if you wanted it added no matter what.
//var update = Update.Push("keywords", "test");
Collections.ProductViewCollection.Update(queryupdate);
 

What an e-commerce database would look like in SQL Server


What an e-commerce database would look like in MongoDB

  • Foreign Keys do not exist in MongoDB
  • You may use References or store the data in the document
  • Depending on how often you need to display or work with those values will dictate if you have denormalized data

MongoDB Model


Adding a new column


ALTER TABLE ProductView
ADD extra nvarchar(10)

var update = Update.Set("extra""");
var result = Collections.ProductViewCollection
    .Update(nullupdate);

Removing a column


ALTER TABLE ProductView
DROP COLUMN extra

var update = Update.Unset("extra");
var result = Collections.ProductViewCollection
    .Update(nullupdate);

Creating an index


CREATE INDEX idx_counter
ON ProductView(counter)

Collections.ProductViewCollection.EnsureIndex(IndexKeys<MongoProductView>
    .Ascending(item => item.Counter), IndexOptions.SetBackground(true));

Drop Table


DROP TABLE ProductView

Collections.ProductViewCollection.Drop();

Select all records from collection


Select * from ProductView

Collections.ProductCollection.FindAll();

select records with an order by


Select * from Product where asin = '123'
order by Title

Collections.ProductCollection.Find(query)
    .SetSortOrder(SortBy<MongoProduct>.Ascending(item => item.Title));

Get the count of rows


Select count(*) from Product where ListPrice < 4.99

var query = Query<MongoProduct>.Where(item => item.ListPrice < 4.99m);
var count = Collections.ProductCollection.Count(query);

Get the Distinct values of a column


Select Distinct(Title) from Product

Collections.ProductCollection.Distinct("Title"query);

MAp Reduce

  • MongoDB has built in functionality to perform Map Reduce
  • It is much easier to get up and running than Hadoop
  • Can be used as a connector for Hadoop

http://docs.mongodb.org/ecosystem/tutorial/getting-started-with-hadoop/
http://docs.mongodb.org/ecosystem/use-cases/hadoop/

Aggregation Framework

This is an alternative to using Map Reduce
Results must be less than 16mb (document max)

Challenges of migrating to MongoDB for a SQL Server user

  • "Like" Statements are very slow
  • Full Text search can only exist for one field per collection
  • If a value changes on a denormalized field, you must perform an update statement to keep the values in sync
  • Update statements are not intuitive
  • No transactions across multiple rows (documents)
  • Two phase commits difficult to understand and keeping state for a rollback is even more difficult

Search can be solved

  • ElasticSearch is an option and is free
  • Requires MongoDB be set up as a Replica Set

http://docs.mongodb.org/manual/tutorial/convert-standalone-to-replica-set/
http://www.elasticsearch.org/

Two Phase commits

  • It is possible and allows for massive scale out but you need a coordinator
  • Additional reading can be found at

http://docs.mongodb.org/manual/tutorial/perform-two-phase-commits/

Information about Text indexes


http://docs.mongodb.org/v2.4/reference/command/text/
http://docs.mongodb.org/v2.4/core/index-text/#index-feature-text

More information

  • Mongodb open data day 2014: http://www.slideshare.net/davidjeet/mongodb-open-data-day-2014
  • Configuration options: http://docs.mongodb.org/manual/reference/configuration-options/
  • Manual: http://docs.mongodb.org/manual/
  • Using the driver: http://docs.mongodb.org/ecosystem/tutorial/use-csharp-driver/
  • Serialization: http://docs.mongodb.org/ecosystem/tutorial/serialize-documents-with-the-csharp-driver/
  • CSharp Driver LINQ Tutorial: http://docs.mongodb.org/ecosystem/tutorial/use-linq-queries-with-csharp-driver/
  • Windows Azure: http://docs.mongodb.org/ecosystem/platforms/windows-azure/