.NET LINQ from Scratch

As software developers, we spend a lot of time extracting and displaying data from many different data sources. Whether it’s an XML web service of some sort or a full-featured relational database, we have been forced to learn a different method of data access for each. Wouldn’t it be great if the method of access were the same for all data sources? Well, we are in luck because, as of the release of C# 3.0 and the .NET Framework 3.5, LINQ has come to change the game forever.

Tutorial Details

  • Introduction to LINQ syntax
  • Projections using LINQ
  • Refining data
  • Standard operators

Current Data Access Overview

On the .NET platform we have been, and still are, using ADO.NET
for accessing different data sources. The open source community has also provided
developers with a number of alternatives.

Language Integrated Query (LINQ) is the new addition to the .NET
family and, as the name suggests, it’s a query style of data access that
is fully supported by the language, unifying the way we access data
and making our lives easier. LINQ can target a number of different sources, namely Oracle,
MSSQL, XML and a few others, but for now we will focus on the most basic of
all, LINQ to Objects.

LINQ to Objects

Normally, to process and refine the data within our lists
and various other data structures, we have used either the ‘foreach’ loop or another
type of looping method to iterate through the objects and process them one by
one according to some condition. This is fine, but frankly it requires a lot of
basic coding that we all wish we didn’t have to write. Essentially we’ve had to tell the
compiler every single detail of the process in order to manipulate the data.

This is exactly where LINQ shines. LINQ allows us
to simply state what we’d like to do and leave it to the compiler and the LINQ library to work
out how to achieve it. If you’ve used SQL before, the strong resemblance
between LINQ and the SQL dialects will be the first thing that you notice.
Like SQL, LINQ supports the “select”, “from”, “where”, “join”, “group ... by”
and “orderby” keywords. Here is a simple example of querying a list of objects:

List initialization:

List<Car> ListOfCars = new List<Car>()
{
    new Car {name = "Toyota"    , owner = "Alex" , model = 1992},
    new Car {name = "Mitsubishi", owner = "Jeff" , model = 2005},
    new Car {name = "Land Rover", owner = "Danny", model = 2001},
    new Car {name = "BMW"       , owner = "Danny", model = 2006},
    new Car {name = "Subaru"    , owner = "Smith", model = 2003}
};

The query:

IEnumerable<Car> QueryResult = from car in ListOfCars
                               select car;

The first part of the preceding code simply populates a list
with five instances of the ‘Car’ class. The next part of the code, however, uses the
“from” and “select” keywords to select a group of objects. The main difference
between SQL and LINQ is that the “from” keyword comes before the “select”
keyword, because we must first define the object we want to operate on. Finally,
the “select” clause tells the compiler what we wish to extract in this query. The above
code simply extracts everything that is in the list and assigns it to the “QueryResult”
variable.

When we query objects (LINQ to Objects), our
queries return an “IEnumerable<T>” sequence of objects. Essentially, the
“IEnumerable<T>” type exposes an enumerator, which
supports simple iteration over a generic collection, and <T>
is the type of each entry in the sequence.

Don’t worry if you aren’t familiar with “enumerators” and “generics”. Just
remember that the result of a LINQ query is always a collection-like data
structure that we can iterate through using a loop, as shown
below:

foreach(Car car in QueryResult)
    Console.WriteLine(car.name);
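
The article never shows the ‘Car’ class itself, so the snippets above will not compile on their own. The following is a minimal sketch of a complete program, assuming ‘Car’ is a plain class with public “name”, “owner” and “model” properties (inferred from the initializers). The “FirstQueryDemo”, “BuildCars” and “SelectAll” names are made up for this illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Assumed shape of Car: the article never shows the class definition.
public class Car
{
    public string name { get; set; }
    public string owner { get; set; }
    public int model { get; set; }   // model year
}

public static class FirstQueryDemo
{
    // Builds the same five cars as the list initialization above.
    public static List<Car> BuildCars()
    {
        return new List<Car>
        {
            new Car { name = "Toyota",     owner = "Alex",  model = 1992 },
            new Car { name = "Mitsubishi", owner = "Jeff",  model = 2005 },
            new Car { name = "Land Rover", owner = "Danny", model = 2001 },
            new Car { name = "BMW",        owner = "Danny", model = 2006 },
            new Car { name = "Subaru",     owner = "Smith", model = 2003 }
        };
    }

    // The query from the article: select every car, unchanged.
    public static IEnumerable<Car> SelectAll(List<Car> cars)
    {
        return from car in cars
               select car;
    }

    public static void Main()
    {
        // Prints the five car names, one per line, in list order.
        foreach (Car car in SelectAll(BuildCars()))
            Console.WriteLine(car.name);
    }
}
```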

We learned that LINQ always returns a collection-like structure similar
to any other list. However, the LINQ query does not execute until its result is
accessed by some other piece of code, like the “foreach” loop above. This
allows us to keep building up the query without the overhead of re-evaluating
it at each new step.
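
One way to see this deferred execution is to change the source list after the query has been defined. The sketch below is only an illustration with a made-up list of owner names. It counts the query result twice, and the second count picks up the entry that was added in between because the query runs again each time it is accessed.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class DeferredDemo
{
    // Returns the counts seen by two evaluations of the same query,
    // with an element added to the source list in between.
    public static int[] CountBeforeAndAfterAdd()
    {
        List<string> owners = new List<string> { "Alex", "Jeff", "Danny" };

        // Defining the query does not run it.
        IEnumerable<string> query = from owner in owners
                                    where owner.Length > 4
                                    select owner;

        int before = query.Count();   // query runs here: only "Danny" matches
        owners.Add("Smith");          // change the source after defining the query
        int after = query.Count();    // query runs again: "Danny" and "Smith"

        return new[] { before, after };
    }

    public static void Main()
    {
        int[] counts = CountBeforeAndAfterAdd();
        Console.WriteLine("{0} then {1}", counts[0], counts[1]);   // prints "1 then 2"
    }
}
```

If you need a snapshot that does not change with the source, calling “ToList()” on the query evaluates it once and stores the result.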

Projections

So far so good; but most of the time, our queries will need
to be more complex; so let’s try projecting data. In SQL, Projection means selecting
the name of the column(s) of table(s) which one wishes to see appearing in the result
of the query. In the case of LINQ to Objects, performing Projection will result
in a different query result type than the type of object that we perform the
query on.

There are two kinds of Projections that we can do. We can
either perform a Projection based on an existing object type, or go completely
the other way by using anonymous types. The following example is of the first
kind:

IEnumerable<CarOwner> QueryResult = from car in ListOfCars
                                    select new CarOwner { owner_name = car.owner };

In the preceding code, the type of the query result is declared as
IEnumerable<CarOwner>, which differs from <Car>, the element type of the ‘ListOfCars’ variable. We have
also used the “new” keyword and made some assignments inside the curly
brackets. Using “select” with the “new” keyword tells the
compiler to instantiate a new ‘CarOwner’ object for every entry in the query result,
and the assignments inside the brackets initialize each instance
of the ‘CarOwner’ class.
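
For reference, here is a self-contained sketch of that projection. Both class shapes are assumptions inferred from the initializers used in this tutorial (the ‘CarOwner’ class is never shown), and “ProjectionDemo” and “OwnersOf” are names invented for the example.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Both class shapes are assumptions inferred from the article's initializers.
public class Car
{
    public string name { get; set; }
    public string owner { get; set; }
    public int model { get; set; }
}

public class CarOwner
{
    public string owner_name { get; set; }
    public int age { get; set; }
}

public static class ProjectionDemo
{
    // Projects each Car onto a new CarOwner. Only owner_name is assigned,
    // so age keeps its default value of 0.
    public static IEnumerable<CarOwner> OwnersOf(IEnumerable<Car> cars)
    {
        return from car in cars
               select new CarOwner { owner_name = car.owner };
    }

    public static void Main()
    {
        List<Car> cars = new List<Car>
        {
            new Car { name = "Toyota", owner = "Alex",  model = 1992 },
            new Car { name = "BMW",    owner = "Danny", model = 2006 }
        };

        foreach (CarOwner owner in OwnersOf(cars))
            Console.WriteLine(owner.owner_name);   // prints "Alex", then "Danny"
    }
}
```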

Nonetheless, if you don’t already have a type defined to
use, you can still perform projections using anonymous types.

Projections using Anonymous Types

It would be a big hassle if, for every Projection, you were
forced to create a new type. That is why, as of C# 3.0, support for Anonymous
types was added to the language. An Anonymous type is created with the “new”
keyword followed by an initializer in curly brackets, with no type name. Because the type
has no name that we can write down, we declare the result with the “var” keyword,
which tells the compiler to infer the type of the variable from the
expression assigned to it.

var QueryResult = from car in ListOfCars
                  select new { 
                      car_name = car.name, 
                      owner_name = car.owner 
                  };

foreach(var entry in QueryResult)
    Console.WriteLine(entry.car_name);

The above is an example of performing a query with Anonymous
types. The only catch to look out for is that an Anonymous type has no name, so it
cannot be declared as the return type of a method.

Accessing the properties of an Anonymous type is easy. In Visual Studio 2008, the Code
Completion/Intellisense also lists the properties exposed by the Anonymous type.

Refining Data

Usually as part of the LINQ query, we also need to refine the
query result by specifying a condition. Just like SQL, LINQ too uses the “where”
clause to tell the compiler what conditions are acceptable.

IEnumerable<Car> QueryResult = from car in ListOfCars
                               where car.name == "Subaru"
                               select car;

The preceding code demonstrates the use of the “where” clause and
the condition that follows it. To define multiple conditions, LINQ supports
the ‘and’ (&&) and ‘or’ (||) operators. The “where” part of the query must always be a
Boolean expression, otherwise the compiler will complain.
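
As a rough sketch of combining conditions, the two queries below use && and || against the same five cars. The ‘Car’ class shape is an assumption (the article does not define it), and the “WhereDemo” method names are hypothetical.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Assumed shape of Car: the article never shows the class definition.
public class Car
{
    public string name { get; set; }
    public string owner { get; set; }
    public int model { get; set; }
}

public static class WhereDemo
{
    public static List<Car> BuildCars()
    {
        return new List<Car>
        {
            new Car { name = "Toyota",     owner = "Alex",  model = 1992 },
            new Car { name = "Mitsubishi", owner = "Jeff",  model = 2005 },
            new Car { name = "Land Rover", owner = "Danny", model = 2001 },
            new Car { name = "BMW",        owner = "Danny", model = 2006 },
            new Car { name = "Subaru",     owner = "Smith", model = 2003 }
        };
    }

    // 'and': both conditions must hold.
    public static IEnumerable<Car> DannysNewerCars(List<Car> cars)
    {
        return from car in cars
               where car.owner == "Danny" && car.model > 2002
               select car;
    }

    // 'or': either condition is enough.
    public static IEnumerable<Car> OldOrVeryNew(List<Car> cars)
    {
        return from car in cars
               where car.model < 2000 || car.model > 2005
               select car;
    }

    public static void Main()
    {
        List<Car> cars = BuildCars();

        foreach (Car car in DannysNewerCars(cars))
            Console.WriteLine(car.name);   // prints "BMW"

        foreach (Car car in OldOrVeryNew(cars))
            Console.WriteLine(car.name);   // prints "Toyota", then "BMW"
    }
}
```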

Order By

When querying objects, it’s possible to rely on the query
target being already sorted. If that isn’t the case, LINQ can take care of it
with the “orderby” clause, which ensures that the result of your query is
properly sorted.

IEnumerable<Car> QueryResult = from car in ListOfCars
                               orderby car.model
                               select car;

If you run the above code, you’ll see that the result of the
query is sorted in ascending order. You can set the direction with the “ascending” and “descending”
keywords, and refine the order further by specifying more than one field to sort
by, separated by commas. The following code sorts by model in descending order:

IEnumerable<Car> QueryResult = from car in ListOfCars
                               orderby car.model descending
                               select car;
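
Sorting by more than one field could look like the sketch below, which orders the cars by owner and then, within each owner, by model from newest to oldest. The ‘Car’ class shape is again an assumption, and “OrderByDemo” and “ByOwnerThenNewest” are illustrative names.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Assumed shape of Car: the article never shows the class definition.
public class Car
{
    public string name { get; set; }
    public string owner { get; set; }
    public int model { get; set; }
}

public static class OrderByDemo
{
    public static List<Car> BuildCars()
    {
        return new List<Car>
        {
            new Car { name = "Toyota",     owner = "Alex",  model = 1992 },
            new Car { name = "Mitsubishi", owner = "Jeff",  model = 2005 },
            new Car { name = "Land Rover", owner = "Danny", model = 2001 },
            new Car { name = "BMW",        owner = "Danny", model = 2006 },
            new Car { name = "Subaru",     owner = "Smith", model = 2003 }
        };
    }

    // Primary key: owner (ascending). Secondary key: model (descending).
    public static IEnumerable<Car> ByOwnerThenNewest(List<Car> cars)
    {
        return from car in cars
               orderby car.owner, car.model descending
               select car;
    }

    public static void Main()
    {
        // Prints: Toyota, BMW, Land Rover, Mitsubishi, Subaru (one per line).
        foreach (Car car in ByOwnerThenNewest(BuildCars()))
            Console.WriteLine(car.name);
    }
}
```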

Grouping

LINQ also allows grouping the query result by the value of a
specific property as shown in this example:

var QueryResult = from car in ListOfCars
                  group car by car.owner into carOwnersGroup
                  select carOwnersGroup.Key;

As you can see, LINQ supports the “group ... by” clause to
specify which objects to group and by which property. The “into” keyword
then gives the grouping a name so that we can project on it; the value that
each group was formed on is available through its “Key” property.
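
The query above keeps only the keys. As a sketch of what else a grouping carries, the variation below selects the whole group, so that each entry exposes both its “Key” and the cars that belong to it. The ‘Car’ class shape is assumed, and “GroupingDemo” and “ByOwner” are hypothetical names.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Assumed shape of Car: the article never shows the class definition.
public class Car
{
    public string name { get; set; }
    public string owner { get; set; }
    public int model { get; set; }
}

public static class GroupingDemo
{
    public static List<Car> BuildCars()
    {
        return new List<Car>
        {
            new Car { name = "Toyota",     owner = "Alex",  model = 1992 },
            new Car { name = "Mitsubishi", owner = "Jeff",  model = 2005 },
            new Car { name = "Land Rover", owner = "Danny", model = 2001 },
            new Car { name = "BMW",        owner = "Danny", model = 2006 },
            new Car { name = "Subaru",     owner = "Smith", model = 2003 }
        };
    }

    // Each IGrouping has a Key (the owner) and can be enumerated for its cars.
    public static IEnumerable<IGrouping<string, Car>> ByOwner(List<Car> cars)
    {
        return from car in cars
               group car by car.owner into ownerGroup
               select ownerGroup;
    }

    public static void Main()
    {
        // Prints "Alex: 1", "Jeff: 1", "Danny: 2", "Smith: 1" (one per line).
        foreach (IGrouping<string, Car> ownerGroup in ByOwner(BuildCars()))
            Console.WriteLine("{0}: {1}", ownerGroup.Key, ownerGroup.Count());
    }
}
```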

Joins

LINQ supports joining data from different collections into one
query result. You can do this using the “join” keyword to specify which objects
to join, and the “on” and “equals” keywords to specify the matching relationship between
the two objects.

Initializing related list:

List<Car> ListOfCars = new List<Car>()
{
    new Car {name = "Mitsubishi", owner = "Jeff" , model = 2005},
    new Car {name = "Land Rover", owner = "Danny", model = 2001},
    new Car {name = "Subaru"    , owner = "Smith", model = 2003},
    new Car {name = "Toyota"    , owner = "Alex" , model = 1992},
    new Car {name = "BMW"       , owner = "Danny", model = 2006},
};

List<CarOwner> ListOfCarOwners = new List<CarOwner>()
{
    new CarOwner {owner_name = "Danny", age = 22},
    new CarOwner {owner_name = "Jeff" , age = 35},
    new CarOwner {owner_name = "Smith", age = 19},
    new CarOwner {owner_name = "Alex" , age = 40}
};

Query:

var QueryResult = from car in ListOfCars
                  join carowner in ListOfCarOwners on car.owner equals carowner.owner_name
                  select new {name = car.name, owner = car.owner, owner_age = carowner.age};

In the above code, using an Anonymous type, we have joined
the two lists into a single query result.

Object Hierarchies using Group Joins

So far, we’ve learned how we can use LINQ to build a flat
list query result. With LINQ, it’s also possible to achieve a hierarchical query
result using a group join (“GroupJoin”). In simple words, every entry in the query
result can carry a collection of related objects in one of its properties.

List<Car> ListOfCars = new List<Car>()
{
    new Car {name = "Mitsubishi", owner = "Jeff" , model = 2005},
    new Car {name = "Land Rover", owner = "Danny", model = 2001},
    new Car {name = "Subaru"    , owner = "Smith", model = 2003},
    new Car {name = "Toyota"    , owner = "Alex" , model = 1992},
    new Car {name = "BMW"       , owner = "Danny", model = 2006},
};

List<CarOwner> ListOfCarOwners = new List<CarOwner>()
{
    new CarOwner {owner_name = "Danny", age = 22},
    new CarOwner {owner_name = "Jeff" , age = 35},
    new CarOwner {owner_name = "Smith", age = 19},
    new CarOwner {owner_name = "Alex" , age = 40}
};

var QueryResult = from carowner in ListOfCarOwners
                  join car in ListOfCars on carowner.owner_name equals car.owner into carsGroup
                  select new {name = carowner.owner_name, cars = carsGroup};

foreach(var carOwner in QueryResult) 
    foreach(var car in carOwner.cars)
        Console.WriteLine("Owner name: {0}, car name: {1}, car model: {2}", carOwner.name, car.name, car.model);

In the above example, the “join” clause is followed by an “into”
part. This differs from the previous join operation that we looked at. Here, the “into”
clause is used to group cars by the owner (into carsGroup) and assign the grouping to the
“cars” property of the anonymous type.

Standard Query Operators

Thus far, everything that we’ve seen has been supported by the C# 3.0
query syntax. However, there is still a large number of operations that have no
query keyword of their own. The standard query operators provide query capabilities including
filtering, projection, aggregation, sorting and more. These operations are
available as methods of the LINQ library and can be called on the result of a query.
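
As a small sketch of calling operators as methods on a query result, the example below runs a query over a made-up array of model years and then applies a few of the aggregate and element operators to it. “OperatorDemo” and “After2000” are names invented for the illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class OperatorDemo
{
    // Illustrative data: the model years of the five cars used earlier.
    static readonly int[] ModelYears = { 1992, 2005, 2001, 2006, 2003 };

    // A query written with keywords; the operators below are called on its result.
    public static IEnumerable<int> After2000()
    {
        return from year in ModelYears
               where year > 2000
               select year;
    }

    public static void Main()
    {
        IEnumerable<int> query = After2000();

        Console.WriteLine(query.Count());   // prints 4
        Console.WriteLine(query.Max());     // prints 2006
        Console.WriteLine(query.Min());     // prints 2001
        Console.WriteLine(query.First());   // prints 2005
        Console.WriteLine(query.Sum());     // prints 8015
    }
}
```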

These operators are listed below for your reference.

Aggregate Operators

  • Sum: returns the sum of all entries
  • Max: returns the maximum value in the collection
  • Min: returns the minimum value in the collection
  • Average: returns the average value for the collection
  • Aggregate: used for creating a customized aggregation
  • LongCount: returns the count of items as a “long”, for collections too large to count with an “int”
  • Count: returns an “int” for the count of items in the collection

Element Operators

  • First: returns the first entry from the result collection
  • FirstOrDefault: returns the default value if the collection is empty, otherwise returns the first entry from the collection
  • Single: returns the only element of the collection, and throws an exception if there is not exactly one
  • SingleOrDefault: returns the default value if the collection is empty, otherwise returns the only element of the collection; throws an exception if there is more than one
  • Last: returns the last entry from the collection
  • LastOrDefault: returns the default value if the collection is empty, otherwise returns the last entry from the collection
  • ElementAt: returns the element at the specified position
  • ElementAtOrDefault: returns the default value if the position is out of range, otherwise returns the element at the specified position
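
The difference between the plain and the “OrDefault” element operators is easiest to see on empty and multi-element collections. The sketch below uses made-up integer arrays, and the “Describe” helper is hypothetical: it only wraps “Single” to show when it throws.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class ElementDemo
{
    // Single() throws InvalidOperationException unless there is exactly one element.
    public static string Describe(IEnumerable<int> numbers)
    {
        try
        {
            return "single: " + numbers.Single();
        }
        catch (InvalidOperationException)
        {
            return "not exactly one element";
        }
    }

    public static void Main()
    {
        int[] empty = new int[0];
        int[] one = { 7 };
        int[] many = { 7, 8, 9 };

        Console.WriteLine(empty.FirstOrDefault());        // prints 0 (default of int)
        Console.WriteLine(many.First());                  // prints 7
        Console.WriteLine(many.Last());                   // prints 9
        Console.WriteLine(many.ElementAt(1));             // prints 8
        Console.WriteLine(many.ElementAtOrDefault(10));   // prints 0 (index out of range)
        Console.WriteLine(Describe(one));                 // prints "single: 7"
        Console.WriteLine(Describe(many));                // prints "not exactly one element"
        Console.WriteLine(Describe(empty));               // prints "not exactly one element"
    }
}
```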

Set Related Operators

  • Except: similar to EXCEPT in SQL, returns the entries of one set that do not exist in the other set
  • Union: returns the unique entries from both sets
  • Intersect: returns the entries that exist in both sets
  • Distinct: returns unique entries from the collection
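
To make the set operators concrete, here is a minimal sketch over two made-up integer arrays. The “Show” helper is hypothetical and only formats a sequence for printing.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class SetDemo
{
    // Formats a sequence as "1, 2, 3" for display.
    public static string Show(IEnumerable<int> numbers)
    {
        return string.Join(", ", numbers.Select(n => n.ToString()).ToArray());
    }

    public static void Main()
    {
        int[] first = { 1, 2, 3, 4 };
        int[] second = { 3, 4, 5 };
        int[] repeated = { 1, 1, 2, 2, 3 };

        Console.WriteLine(Show(first.Except(second)));      // prints "1, 2"
        Console.WriteLine(Show(first.Union(second)));       // prints "1, 2, 3, 4, 5"
        Console.WriteLine(Show(first.Intersect(second)));   // prints "3, 4"
        Console.WriteLine(Show(repeated.Distinct()));       // prints "1, 2, 3"
    }
}
```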

Generation Operators

  • DefaultIfEmpty: if the result is empty, returns a collection containing only the default value
  • Repeat: returns a collection that repeats one value a specified number of times
  • Empty: returns an empty IEnumerable collection
  • Range: returns a range of numbers for a specified starting number and count

Refining Operators

  • Where: will return objects that meet the specified condition
  • OfType: will return objects of the specified type

Conversion Operators

  • ToLookup: returns the result as a lookup
  • ToList: returns the result as a List collection
  • ToDictionary: returns the result as a dictionary
  • ToArray: returns the result as an Array collection
  • AsQueryable: returns the result as an IQueryable<T>
  • AsEnumerable: returns the result as an IEnumerable<T>
  • OfType: filters the collection according to the specified type
  • Cast: used to convert a weakly typed collection into a strongly typed collection

Partitioning Operators

  • Take: returns a specified number of records
  • TakeWhile: returns records for as long as the specified condition evaluates to true
  • Skip: skips a specified number of entries and returns the rest
  • SkipWhile: skips entries for as long as the specified condition evaluates to true and returns the rest
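
A common use of the partitioning operators is paging through a result. The sketch below shows one possible way to do it over the numbers 1 to 10; the “Page” helper and its parameters are hypothetical, not part of the LINQ library.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class PartitionDemo
{
    // Hypothetical paging helper: pageNumber starts at 1.
    public static IEnumerable<int> Page(IEnumerable<int> source, int pageNumber, int pageSize)
    {
        return source.Skip((pageNumber - 1) * pageSize).Take(pageSize);
    }

    public static string Show(IEnumerable<int> numbers)
    {
        return string.Join(", ", numbers.Select(n => n.ToString()).ToArray());
    }

    public static void Main()
    {
        IEnumerable<int> numbers = Enumerable.Range(1, 10);   // 1 through 10

        Console.WriteLine(Show(numbers.Skip(3).Take(4)));          // prints "4, 5, 6, 7"
        Console.WriteLine(Show(numbers.TakeWhile(n => n < 4)));    // prints "1, 2, 3"
        Console.WriteLine(Show(numbers.SkipWhile(n => n < 8)));    // prints "8, 9, 10"
        Console.WriteLine(Show(Page(numbers, 2, 3)));              // prints "4, 5, 6"
    }
}
```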

Quantifier Operators

  • Any: returns true if at least one object meets the specified condition
  • Contains: returns true if the specified object exists in the collection
  • All: returns true if all objects meet the specified condition
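
The quantifier operators each answer a yes/no question about a collection. A minimal sketch, using a made-up array of model years:

```csharp
using System;
using System.Linq;

public static class QuantifierDemo
{
    // Illustrative data: the model years of the five cars used earlier.
    public static readonly int[] ModelYears = { 1992, 2005, 2001, 2006, 2003 };

    public static void Main()
    {
        // Is there at least one year before 2000?
        Console.WriteLine(ModelYears.Any(year => year < 2000));   // prints True

        // Are all years after 2000?
        Console.WriteLine(ModelYears.All(year => year > 2000));   // prints False

        // Does the collection contain this exact value?
        Console.WriteLine(ModelYears.Contains(2003));             // prints True
    }
}
```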

Join Operators

  • Join: returns entries where keys in sets are the same
  • GroupJoin: used to build hierarchical objects based on a master and detail relationship

Equality Operators

  • SequenceEqual: returns true when two collections contain equal elements in the same order

Sorting Operators

  • Reverse: returns a reversed collection
  • ThenBy: used to perform further sorting
  • ThenByDescending: used to perform further sorting in descending order
  • OrderBy: used to define order
  • OrderByDescending: used to define descending order

Projection Operators

  • SelectMany: used to flatten a hierarchical collection
  • Select: used to identify the properties to return

Concatenation Operators

  • Concat: used to concatenate two collections

So What Now?

LINQ has proven itself to be very useful for querying objects, and the SQL-like syntax makes it easy to
learn and use. Also, the vast number of Standard Operators makes it possible to chain a number of operators
to perform complex queries. In a follow-up to this tutorial, we’ll review how LINQ can be used to
query databases and XML content.
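
As one illustration of chaining, the sketch below filters, sorts, projects and truncates in a single expression to find the names of the two newest cars built after 2000. The ‘Car’ class shape is an assumption, and “ChainingDemo” and “TwoNewestAfter2000” are names made up for the example.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Assumed shape of Car: the article never shows the class definition.
public class Car
{
    public string name { get; set; }
    public string owner { get; set; }
    public int model { get; set; }
}

public static class ChainingDemo
{
    public static List<Car> BuildCars()
    {
        return new List<Car>
        {
            new Car { name = "Toyota",     owner = "Alex",  model = 1992 },
            new Car { name = "Mitsubishi", owner = "Jeff",  model = 2005 },
            new Car { name = "Land Rover", owner = "Danny", model = 2001 },
            new Car { name = "BMW",        owner = "Danny", model = 2006 },
            new Car { name = "Subaru",     owner = "Smith", model = 2003 }
        };
    }

    // Where -> OrderByDescending -> Select -> Take, chained as method calls.
    public static IEnumerable<string> TwoNewestAfter2000(List<Car> cars)
    {
        return cars.Where(car => car.model > 2000)
                   .OrderByDescending(car => car.model)
                   .Select(car => car.name)
                   .Take(2);
    }

    public static void Main()
    {
        foreach (string name in TwoNewestAfter2000(BuildCars()))
            Console.WriteLine(name);   // prints "BMW", then "Mitsubishi"
    }
}
```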

Arman Mirkazemi is DenonStudio on Codecanyon
Tags: .net
  • ShadowAssassin

    Thank you mate, the operator list should be useful :)

  • Ryan

    Much appreciated. I’ve been wanting to get familiar with this but didn’t want to go buy a huge book on the subject and the MSDN lessons leave much to be desired.

    Can’t wait to see the next tutorial in the series!

  • Stuart Allen

    great writeup on LINQ, its very easy to forget the syntax, this will be a bookmarked resource for sure!

  • http://www.freshclickmedia.com Shane

    I’m a big fan of LINQ – very powerful and elegant stuff.

  • Ryan

    Finally someone wrote a LINQ tutorial that is easy to follow and not a novel. Awesome work.

  • khaled

    I love this feature i use it a lot in my C# projects

  • http://tutorial-city.net Tutorial City

    The author url (in the author bio) is missing

  • http://spotdex.com/ David Moreen

    Dude these LINQ tutorials are one of the more unique tutorials you could find. Not to mention good quality tutorials, glad to see one on nettuts!

  • http://laranzjoe.blogspot.com lawrence77

    Awesome just dreaming about LINQ posts, :D

    excepting a detailed LINQ tut as plus tut :)

    Nice tut by the way arman!

  • Skunkie

    LINQ rocks. The coolest thing right now is LINQ to Entities. With Microsofts new Entity Framework (which is built on ADO.NET) you can do a complete object-relational mapping from highly complex databases with just a few mouseclicks, and then you can use the elegant LINQ syntax to query against the auto-created entity objects, not the database anymore. And the objects take care of themselves, they know about their relationships to other objects (the database foreign keys) and their state (changed, updated, saved, whatever).

    This way, if you want all the customers with orders of more than $1,000, you could write the following LINQ query (C#)

    var importantCustomers =
    from customer in my.Customers
    where customer.Orders.Any(order => order.Total > 1000)
    select customer;

    and then you could, for example, loop over every importantCustomer in importantCustomers and get the customer street simply with the following statement

    importantCustomer.Adress.Street

    This example would query the database over three related tables

    No more SQL, no more joins, no more plumbing work , just elegant coding!

    In my opinion, there is no other technology right now which lets you get a complex database-driven app up and running faster, no matter if you are developing desktop apps or for the internet.

  • Natrium

    LINQ is a really sweet feature!

    There’s also jLinq (for Javascript): http://hugoware.net/Projects/jLinq
    and PHPLinq (for PHP): http://www.codeplex.com/PHPLinq

    If you are used to work with Linq in .NET, it is really neat to have the same functionallity when working in another language.

  • JD

    Very interesting, how does this compare with “traditional methods” in terms of performance?

    • Arman Mirkazemi

      To be honest, I’ve never profiled Linq vs my own code. Mainly because I can’t be sure of how optimized my code is. However I know that one of the reasons for wanting to use Linq instead of your own code is because Linq code is generated according to the statement and therefore is meant to be really optimized.

    • Skunkie

      You can “abuse” LINQ in a way that will have a massive negative impact on performance. Watch for the following issues:

      1. Only query data you need. Of course it is nice and simple to request whole object graphs with a simple “generalized query” when you only need certain data. So use anonymous types to custom shape your data.

      2. Watch out for implicit lazy loading. Every time you loop over an object of type IQueryable you make a full round trip to the database even if the same data has been queried before. So eager load data by use of the ToList() function, for example.

      3. Keep an eye on the generated SQL. The generated SQL is nice and crisp a lot of times, but there are occasions where you can optimize.

  • Trixz

    LINQ is exiting, but in real life project it still suffers from poor preformance. The easy joins suddenly gets complex if you don’t want the LINQ to get data more then one-two “jumps” away from main table.

    It is possible to define linq to do these gets, but you then end up in a much more complex query then those of SQL.

    The lambda expressions is not LINQ unique. You are able to do a lambda-query on a generic list etc, so LINQ is the ORM mapping of the database.

    Notice also that my exp lies with LINQ to SQL (will not be more developed and will not be supported in future) and LINQ to Entities is much powerful. I think they are on the right track, and the expressions you can write is awsome and powerful, but it will take some time for the technology to mature.

    Another “hard” part to learn is how to handle the datacontext and it’s lifecycle. Since the datacontext flags all entities as “changed”, “new”, “deleted” etc, it’s easy to get “data not valids” messages.

    My word of advice is to look into LINQ but do not rely on it’s preformance on your next project.

    • http://www.wdonline.com Jeremy McPeak

      Performance is in the eye of the beholder. Sometimes a performance hit is acceptable; sometimes it isn’t. It’s up to the developer to decide what is acceptable and what isn’t.

      If I have a project that I have a hard deadline on and I don’t have time to “do it correctly”, I’m going to opt for the slower performing, yet more development-time efficient solution. Yes, LINQ is currently the slower option, but performance is better with .NET 4 (how much so, I do not know. I have yet to test it).

      As for LINQ to SQL, it is still supported, and all indications point to continued support. It’s adoption rate was high, and there is demand for new features and better performance for it. If it were dead, there wouldn’t be improvements and features added to LINQ to SQL in .NET 4. It has its place in application development, just as EF, nHibernate, and straight up ADO.NET do.

  • Steven

    awesome, very usefull tut, thanks a lot

  • http://www.jordanwalker.net Jordan Walker

    That is a great article in regards to LINQ.

  • http://sonergonul.com Soner Gönül

    I love linq

    Thanks

  • http://www.designer-depot.com yassaman

    Nice tut, Thanks alot..
    Thank you, well done, Mr. Mirkazemi

  • Peter

    very good summary, you are great.

  • http://www.hardikshah.org Hardik Shah [Guru]

    Thanks and regards for the this article.

  • Lorenzo

    Linq: the dream come true. Thanks, Microsoft

  • http://www.articleshive.com waqas

    outstanding simply superb article

  • http://www.czekamtu.pl/ Motoman

    That is a great article! LINQ is COOL.

  • http://www.activequote.com Richard T

    Have i missed something here – this article seems to be about 2 years late. LINQ became fully integrated with VB and C# with the release of .net 3.5 in Nov 2007. Has something new been introduced now (March 2010) ?

  • mastak

    wow.. man.. It’s a great post, I never seen such short and very informative tutorial in all known me books / posts. Thx a lot ! Try to continue in same way ;)

  • http://ignou-student.blogspot.com http://ignou-student.blogspot.com

    great article, i want to use it on my current project.

  • http://www.biggle.de Mario

    very nice summary of operators, thx!

  • Rubia

    it’s good easy ans simple. I really enjoyed.

  • Omprakash

    Thanks mate….Its a really nice article

  • Baskaran

    This is very useful. I am a newbie to LINQ, it helps me to understand the basics.

  • Raj

    Thanks bro. very useful article.

  • http://prakashm.net Prakash Mani

    Thanks buddy … Really appreciated.

  • smitha poluri

    aw. guys. thanks for the tutorial. very informative

  • abid

    It’s simply the best, g8 job. Thanks

  • grey

    it is very good tutorials!
    i waited for 4 months for rest of the LINQ tutorials but i think there isn’t gonna be anymore!

  • amardeep

    It’s simply the best, g8 job. Thanks

  • NB

    This is greate tuts,and love to see more deep about linq to XML

  • http://www.xpresscell.com KinQ

    Great tutorial….thanks a lot

  • Aamir Abbas

    Great information… I like it very much… Thanks

  • Madhu Shekar

    Thank you very much for writing this article…it’s very informative….I want Linq using to query databases, where can I find….

  • John Rob

    Thank you very much. This has become my best material on LINQ topic. You made it very simple and understandable, thanks for sharing with us. Some other good articles on LINQ I was found over internet during searching this topic which also explained very well about LINQ, URL links of those posts are….

    http://www.codeproject.com/Articles/188935/LINQ-Demo-with-ASP-NET-Web-Application

    http://msdn.microsoft.com/en-us/library/bb907622.aspx

    http://mindstick.com/Articles/9ca8fabd-49ef-4e4d-855b-74fc523d9138/?LINQ%20%28Language%20Integrated%20Query%29

    Lastly, I would like to say thanks to everyone for your precious post.

  • Oliverr

    Great tutorial and good examples. Thanks.

  • http://alinesullivan.webstarts.com/ AnikaSimmons

    I wasn’t aware of some of the information that you mentioned so I want to just say thank you.

  • ranacseruet

    Nice tutorial. I have also written few linq tutorials . Hopefully they are worthy for reading also.