Improved Indexing in Lex.DB 1.1

Original indexing feature was rather limited. You could look for records with specified values of indexed key only. Basically it was only useful for speedy access to related entities. Like child entities via parent key.

This has now been drastically improved. The new indexing adds capabilities to query records with less, great and between operators. In addition some LINQ-like methods are available: Skip, Take and Where, helping you to limit record loading.

Let me show few examples to understand the details. Indexing declaration during initialization phase is not changed, changes in code are minimal. I just copied the code snippets from original indexing post.

public class Contact 
{   
  public int Id { get; set; }   
  public string FirstName { get; set; }   
  public string LastName { get; set; }   
  public DateTime? BirthDate { get; set; }   
  public string FullName { get { return FirstName + " " + LastName; } } 
}

And initialization:

db.Map<Contact>()
  .Automap(i => i.Id)
  .WithIndex("LastName", i => i.LastName)
  .WithIndex("FullName", i => i.FullName);

And now we’d like to get all contacts with the last name Guthrie:

var contacts = db.Table<Contact>();
var list1 = contacts.IndexQueryByKey("LastName", "Guthrie").ToList();

Or without implicitly inferring index type:

var list2 = contacts.IndexQuery<string>("LastName").Key("Guthrie").ToList();

The interesting part is the new method IndexQuery<K> that returns IIndexQuery interface. This interface is backed by an object containing indexed query parameters.
For example, if you want to query all last names starting from letter L, following query construct can be used.

The IndexQuery instance is immutable, so we may cache the query in variable:

var index = contacts.IndexQuery<string>("LastName");

// The index has low bound L (inclusive) 
// and high bound M (exclusive by default),
// so we basically make operator Between 
// "L" <= i.LastName < "M"
var query = index.GreaterThan("L", true).LessThan("M");

To actually materialize the entities from query above ToList() method is used. All relevant entities will be loaded at once in a singe read-only transaction:

var items = query.ToList();

If we don’t want to load relevant entities, but just to look at resulting last names, we can use lazy loading feature with ToLazyList() method. In this case list of Lazy<T> entity wrappers is returned. Lazy<T> will load wrapped entity only once, when you access Lazy<T>.Value property.

For your convenience indexed key value stored in Key property is also part of Lazy<T>:

var items = query.ToLazyList();

foreach(var i in items)
{
  var lastName = i.Key;
  if ((lastName ?? "").Length > 5)
  {
    // load from database
    var entity = i.Value;
    // process entity
  }
}

Sometimes you need to limit the resulting number of entities returned by the indexed query. To avoid materialization of excessive entities, limit your query with Take(int) method:

var items = query.Take(5).ToList();

In this case only five first available entities will be loaded from database.

Sometimes you’d want to implement paging. To simplify this scenario, Skip(int) method can be used:

var pageIndex = 3;
var pageSize = 50;
var items = query.Skip(pageIndex * pageSize).Take(pageSize).ToList();

This will skip first 3 pages with page size of 50 entities. The order you call Skip and Take are not important for sake of simplicity.

There is also parameterless overload for PK index query allowing you to query primary index the same way:

var index = contacts.IndexQuery<string>();

And last but not least. There are some complex logic involved in deciding whether to load entity or not based on indexed values. To simplify this scenario Filter<K>(Func<K, bool>) is available at your disposal:

var items = query.Where(i => (i ?? "").Length> 5).Take(5).ToList();

The method above will return first five entities with last name’s length greater than 5.
Similarly to LINQ, IndexQuery has Count() method to determine the number of entities without actually loading them.

To get list of primary keys for resulting entities, ToIdList() method is used. If you pass the resulting value to DbTable.LoadByKeys(), you’ll get all keyed entities loaded. That is what ToList() method does.

But be aware, all these methods above are not LINQ. Thy are just plain old CLR methods. They look like LINQ, but LINQ they are not. So don’t expect them to behave exactly like LINQ.

Feel free to submit your feedback. Sometimes one’s minute can save another’s days.

33 thoughts on “Improved Indexing in Lex.DB 1.1”

  1. Hi

    I’d like your advice or comments on some of the challenges I’m facing when trying to migrate from SQLite. Most of the challenges are either because of my use of LINQ, or because in some situations I’ve used raw SQL commands.

    For example:

    * No FirstOrDefault/FirstOrDefaultAsync. In the situations I’m having trouble recording for Lex.db, I don’t know an explicit key value (e.g. I just want the first value stored in a table) or I just want the first matching value returned, e.g.:

    return DBAsyncConnection.Table().Where(t => t.StoredFile == false).FirstOrDefaultAsync();

    I suppose I could perform a query and then limit it to a result of 1. The drawback to that approach though is that you will always return an array but I can try and figure a way around that for singleton results.

    * Searching for strings that *contain* the query string, not just exact matches, e.g.:

    return DBAsyncConnection.Table().Where(v => v.FamilyName.Contains(query)).ToListAsync();

    * Direct SQL commands are going to be tougher to match! An example of where I’m using them is because I want distinct values returned. So, for example, if I’m searching for a string, I may get 10 results, 3 of which are actually the same name and I only want to display the unique values. I solved that with SQLite with:

    return DBAsyncConnection.QueryAsync(“select distinct Name_Piece_Given from RHPersonalNameStructure where (Name_Piece_Given like (‘%’ || ? || ‘%’)) limit 5″, query);

    This is a valuable command because the database does the heavy lifting for me. In other words, if there are more than 5 results, it will only give me 5 unique results. If I had to do it in code, I’d probably have to get ALL the results and then filter out the first 5 unique ones. Getting all of the results could be a more costly operation.

    I *think* this can be done with your Filter functionality but I’m struggling to understand your examples unfortunately.

    * Sorting. I’ve occasionally used OrderBy:

    return DBAsyncConnection.Table().Where(v => v.Name_FirstnameFirst.Contains(query) || v.Name_SurnameFirst.Contains(query)).OrderBy(v => v.Name_SurnameFirst).ToListAsync();

    * Multiple matches in a single query:

    return DBAsyncConnection.Table().Where(v =>
    v.EventDescriptor.Contains(query) ||
    v.AgeAtEvent.Contains(query) ||
    v.ResponsibleAgency.Contains(query) ||
    v.CauseOfEvent.Contains(query) ||
    v.FactDetail.Contains(query)).ToListAsync();

    * Query counts:

    return DBAsyncConnection.ExecuteScalarAsync(“select count(*) from RHNoteStructure where RHEvent = ?”, ID);

    Now for non-query counts, that’s easy – just use CountAsync(). In the example given, I think that I could use the filter functionality but I don’t understand your example well enough to translate my code :-(.

    That’s it for now :-) Many thanks for any help or suggestions.

    Philip

    1. Philip, as I understand your queries – they all require full scan on underlying data. That means all records have to be loaded from storage, killing performance.

      Lex.DB can limit loaded records if you use indexed queries as in my example above. (FirstOrDefault could be simulated with Take(1) at this moment).

      Where filter can limit loaded records as well, if your define index for your text expressions.

      Table
      .WithIndex(“NameText”, v => v.FirstnameFirst +’/n’+ v.SurnameFirst)

      Table
      .WithIndex(“DetailsText”, v => v.EventDescriptor + ‘/n’ + v.AgeAtEvent + ‘/n’ + v.ResponsibleAgency + ‘/n’ + v.CauseOfEvent + ‘/n’ v.FactDetail)

      This one is for optimizing RHNoteStructure query by RHEvent key:

      Table
      .WithIndex(“RHEvent”, v => v.RHEvent)

      Later your may fast filter with:

      Table.IndexQuery(“NameText”).Where(i => i.Contains(query)).ToListAsync();

      Table.IndexQuery(“DetailsText”).Where(i => i.Contains(query)).ToListAsync();

      Table.IndexQuery(“RHEvent”).Key(ID).CountAsync();

      If you don’t mind I’ll note your feedback on github project page for next version. The idea to limit materialized data direct after loading is pretty neat.

      Thank you,
      Lex

      1. Hi Lex

        I may have made a typing mistake when trying to copy your example of how to build a complex Map entry, but the code compiles but gives this error when running:

        Cannot extract member information

        The stack trace is:

        at Lex.Db.TypeMap`1.ExtractMemberEx[K](Expression`1 prop, Expression& target, MemberUsage usage)
        at Lex.Db.TypeMap`1.ExtractMember[K](Expression`1 prop, MemberUsage usage)
        at Lex.Db.TypeMap`1.WithIndex[I1](String name, Expression`1 indexBy, IComparer`1 comparer)

        This is the entry that causes the problem:

        .WithIndex(“EventFTMatch”, i => i.EventDescriptor + “\n” +
        i.AgeAtEvent + “\n” +
        i.ResponsibleAgency + “\n” +
        i.CauseOfEvent + “\n” +
        i.FactDetail);

        Thank you!

        Philip

  2. I haven’t tested it yet but I think I’ve solved my “FirstOrDefault” challenge:

    return DBAsyncConnection.Table().Where(t => t.Xref == xref).FirstOrDefaultAsync().Result;

    becomes

    return LexDB.Table().IndexQueryByKey(“Xref”, xref).Take(1).ToList().FirstOrDefault();

    Although there may be a simpler way of getting the singleton result … sometimes my lack of understanding of C#, interfaces, etc just gets in my way.

    1. You are right. At this moment there is no native FirstOrDefault.

      But it will be available soon, since it’s quite easy to implement.
      As a temporary solution, you may use following snippet:

      public static class MyLexDbExtensions
      {
          public static K FirstOrDefault<T, K>(this IIndexQuery<T, K> query) 
          {
             return query.Take(1).ToList().FirstOrDefault();
          }
      
          public static Task<K> FirstOrDefaultAsync<T, K>(this IIndexQuery<T, K> query) 
          {
             return TaskEx.Run(() => query.Take(1).ToList().FirstOrDefault());
          }
      }
  3. Hi Lex

    You wrote in the blog post:

    “There is also parameterless overload for PK index query allowing you to query primary index the same way:

    var index = contacts.IndexQuery();”

    However, the compiler is reporting:

    No overload for method ‘IndexQuery’ takes 0 arguments.

    Philip

    1. You have to specify type of the primary key. WordPress strips the angle brackets from text, what is annoying.

      contacts.IndexQuery<int>();”

      1. I’m still getting the same compiler error. An example of my revised code is now:

        return LexDB.Table().IndexQuery().Key(ID).Take(1).ToList().FirstOrDefault();

        but the compiler is still saying “No overload for method ‘IndexQuery’ takes 0 arguments”. Looking at Lex.Db.DbTable, it looks like all of the definitions of IndexQuery require a string parameter called name?

        Philip

        1. oh, my bad. To avoid exactly this kind of misunderstanding, I named the PK version of this method just Query<K>

          So in this case it’ll be just contacts.Query<int>();

  4. Will you be blogging any more on the Lex.Db.Sync? It is mentioned lots but never really discussed. I have downloaded all you samples / unit tests but haven’t managed to get anything to actually run. I am very interested.

    Thanks

    jack

    1. I still need to update Lex.Db.Sync to use Lex.Db 1.1.

      I’m planning to write a blogpost about Db & Sync combination in few days.

  5. Is there a way to sort the index before the Take() method? I’d like to sort by ascending for example then take the top 5 records.

    1. Steve

      Something like this should work:

      LexDB.Table().LoadAll().OrderBy(I => i.Name).Take(5).ToList();

      I haven’t tested this yet – still working on converting my code over to LexDB :-).

      Philip

    2. Index is already sorted by defining expression.

      If you need to sort by another criteria, you need to load all records anyway. What basically LoadAll does.

      I see the growing IEnumerable usage of LoadAll methods. Maybe it’s time to return real enumerator?

  6. Hi, I’m using your Lex database for both a Windows Phone 8 and Windows 8 application. It will be synced with an SQL Azure database at a later date. I have used several ORM systems in the past and I have a couple of questions about how to code classes for use in Lex.
    Let’s look at a couple of simple examples. I have a project class and a project status class. Normally in C# I would have the following, where the Guid’s are the primary keys for the object.


    public class Project
    {
    public Guid Oid {get; set;}
    public string name {get; set;}
    public ProjectStatus Status {get; set;}
    }

    public class ProjectStatus
    {
    public Guid Oid {get; set;}
    public string name {get; set;}
    }

    So my first question is does Lex know that the reference to ProjectStatus in the Project class is really a reference to its primary key, i.e. Oid? Or should I be explicit and make the last property


    public Guid ProjectStatusOid {get; set;}

    And use that to construct the Project object myself? Could I have both and just use the ignore attribute on the public ProjectStatus Status property? I’m looking for your recommended best practice here.

    My second question concerns the use of collections. Let me add to the project collection a List. Say the following.


    public List {get;set;}.

    Does Lex handle this or do I need to provide a connecting class such as


    public class ProjectsAndMembers
    {
    public Guid Project {get;set}
    public Guid ProjectMember {get;set;}
    }

    And use that table to query my project members.

    Thanks your project looks very interesting so far. Are you planning a paid for version?

    Jim Smyth

  7. Hi Lex!!!
    I have created a questions table using Lex.DB,
    1. Now I have to insert almost 50k questions and options into it..Is there any easy way to this?

    2. How to get Auto Id’s for my questions so that it automatically creates next ID number one after the other..

  8. Hi Lex,

    do I see it correctly that Lex.DB does not have async methods for querying via indexed values? Should I just encapsulate that in a Task, or what would be your recommended way?

    Thanks a lot!

  9. I’m getting exception with the below code..Can you please give me a code for setting two components in one index

    db.Map()
    .Automap(i => i.Qn_ID, true)
    .WithIndex(“Selection”, i => i.Qn_SubjectID + i.Qn_TopicID);

    var index = db.Table().IndexQueryByKey(“Selection”, “Anatomy”, “Neuro Anatomy”).ToList();

  10. Hello

    We use your database to handle the offline functionality in our app which is used to record time sheets. I have an initialization that looks like this:

    dbInstance = new DbInstance(“dayEntries. dbInstance.Map().
    Automap(p => p.Id).
    WithIndex(“SpentAt”, i => i.SpentAt).
    WithIndex(“DayEntryActionType”, i => i.ActionType);

    dbInstance.Initialize();

    Later in the code I use the ActionType property to filter out dayEntries with value greater than zero:

    var index = dbInstance.Table().IndexQuery<int("DayEntryActionType");

    return await index.GreaterThan((int) ActionType.None).ToListAsync();

    Everything works fine at first but after a while and when the app is used a few times LedDb throws a KeyNotFoundException.

    Have I done something wrong?

    1. You should not mix types in index definition and index usage.

      var index = dbInstance.Table().IndexQuery<ActionType>("DayEntryActionType");
      return await index.GreaterThan(ActionType.None).ToListAsync();
      1. Sorry

        The property ActionType is in fact an integer. I have used it as an enum before but changed it for testing purposes to see if it was the source of the error.

          1. I did now manage to reproduce the error in another part of my app where I select expenses depending on date.

            I have this in ctor:
            dbInstance = new DbInstance(“expenses.db”);
            dbInstance.Map().Automap(p => p.Id).WithIndex(“SpentAt”, i => i.SpentAt);
            dbInstance.Initialize();

            This query code with query works for a while:
            var loadExpenseTask = dbInstance.Table().IndexQuery(“SpentAt”).Where(p =>
            p.Date >= startOfWeek &&
            p.Date <= endOfWeek);

            var expenses = await loadExpenseTask.ToListAsync();

            Get this error:
            System.Collections.Generic.KeyNotFoundException was unhandled by user code
            HResult=-2146232969
            Message=The given key was not present in the dictionary.
            Source=mscorlib
            StackTrace:
            at System.Collections.Generic.Dictionary`2.get_Item(TKey key)
            at Lex.Db.Indexing.DataIndex`2.ReadNode(DataReader reader, Dictionary`2 keyMap, DataNode`1 parent)
            at Lex.Db.Indexing.DataIndex`2.Read(DataReader reader, DbFormat format)
            at Lex.Db.DbTable`1.ReadIndexes(Stream stream)
            at Lex.Db.DbTable`1.LoadIndex(IDbTableReader reader)
            at Lex.Db.DbInstance.BulkReadScope.GetReader(DbTable table)
            at Lex.Db.DbTable`1.ReadScope()
            at Lex.Db.Indexing.DataIndex`2.ExecuteToList[L](IndexQueryArgs`1 args, Func`3 selector)
            at Lex.Db.IndexQueryBase`3.ToIdList[PK]()
            at AdviseSolutions.TimeToHarvest.Infrastructure.ExpenseRepository.c__DisplayClassf.b__e()
            at System.Threading.Tasks.Task`1.InnerInvoke()
            at System.Threading.Tasks.Task.Execute()
            InnerException:

  11. I have a quiz app which loads around 3000-5000 questions to a list from the “Lex.Db” database and sets a next question from element (0) and removes the same question at element (0) after completing the question. It goes till the end..

    I’m saving the (int)no. of questions in settings, whenever the user navigates away from the quiz page, now when navigated back to the page I want to remove those number of (int no. of questions) from the new list.

    What is the easy method to achieve the above requirement?

    1. How will be the performance of the app if I use skip(int no. of questions)? or Should I use another method?

    2. Instead, should I load the list of primarykey’s and pass each key of the list to DbTable.LoadByKeys() and then while resuming skip(int no. of questions) from the list of primarykey’s?

    1. I believe that first approach will be a bit faster, because you save “locate record in datafile by given key” step.

      Using LoadAll or BulkLoad with Skip() you’ll probably achieve better performance. It is also easier to understand.

      Why do you need to load all questions in memory?

      1. This app is for revising a subject(medical) before examination. So, the user tries to complete all the questions of single subject at a fast pace.. The app is mainly for revising(self assessment), not like a test.

        Do you have any better idea?

      2. I have improved my code like this :), Hope it is really fast than before..

        if (SbQuestions.Count > 0)
        {
        SbQuestions.RemoveAt(0);
        if (SbQuestions.Count > 0)
        {
        this.NextSbQuestion();
        }
        else if (SbQuestions.Count == 0)
        {
        using (var db = new DbInstance("Questions"))
        {
        db.Map()
        .Automap(i => i.Qn_ID, true)
        .WithIndex("SubjectID", i => i.Qn_SubjectID)
        .WithIndex("TopicID", i => i.Qn_TopicID);
        db.Initialize();

        //int "QnNo" is incremented in the nextquestion()

        SbQuestions = db.Table().Skip(QnNo).Take(2).ToList();
        }
        if (SbQuestions.Count > 0)
        {
        this.NextSbQuestion();
        }
        else
        {
        this.StopQuestions();
        }
        }
        }

        else if (SbQuestions.Count == 0)
        {
        using (var db = new DbInstance("Questions"))
        {
        db.Map()
        .Automap(i => i.Qn_ID, true)
        .WithIndex("SubjectID", i => i.Qn_SubjectID)
        .WithIndex("TopicID", i => i.Qn_TopicID);
        db.Initialize();

        SbQuestions = db.Table().Skip(QnNo).Take(2).ToList();
        }
        if (SbQuestions.Count > 0)
        {
        this.NextSbQuestion();
        }
        else
        {
        this.StopQuestions();
        }
        }

        Thanks Lex, Looking forward to get support to load database directly from the installation folder.

    1. I have just checked the update details of the new version..
      Thanks for improving and supporting the database.. How can I load database from a local folder? Can you please give me an example?

Leave a Reply

Connect with:

Your email address will not be published. Required fields are marked *

You may use these HTML tags and attributes: <a href="" title=""> <abbr title=""> <acronym title=""> <b> <blockquote cite=""> <cite> <code> <del datetime=""> <em> <i> <q cite=""> <strike> <strong>