Monday, July 7, 2014

Respecting Owner Limits in Belongs To Associations

So, here's the scenario: I have a rails 4.0.4 application. This application has many shares, each share has many share_users which each have an 'approved' boolean attribute. However each share also has a user_limit. So how does one ensure that the number of a share's approved share_users does not exceed the share's user_limit?

If you just want the solution, skip down to Attempt Three.

Attempt One - Overwrite the share_user's approved=(value) method
New approved=(value) method:

 def approved=(value)  
      share = self.share  
      if value == true  
           if share.share_users.where("approved = true").count < share.user_limit  
                write_attribute(:approved, value)  
           else  
                raise Exceptions::ShareUserLimitReached  
           end  
      end  
      if value == false  
           write_attribute(:approved, value)  
      end  
 end  

Supporting share_user spec:

 describe "approved=(value) > " do  
           it {should respond_to :approved=}  
           let(:approved_share) {FactoryGirl.create(:basic_share, user_limit: 1)}  
           let!(:approved_share_user) {FactoryGirl.create(:share_user, share_id: approved_share.id, approved: true)}  
           let!(:disapproved_share_user) {FactoryGirl.create(:share_user, share_id: approved_share.id, approved: false)}  
           it "should not throw a ShareUserLimitReached error when the share's user_limit is not being exceeded" do  
                expect{share_user.approved=true}.not_to raise_error  
           end  
           it "should throw a ShareUserLimitReached error when the share's user_limit attempts to be exceeded" do  
                approved_share.user_limit = 2  
                approved_share.save  
                expect{disapproved_share_user.approved=true}.to raise_error Exceptions::ShareUserLimitReached  
           end  
           it "should allow share_users to be disapproved without issue" do  
                expect{approved_share_user.approved=false}.not_to raise_error  
           end  
      end  

Now, if you look at the code, you'll see a couple of things.

  • share_user#approved=(value) is not encapsulated. It's reaching through to the share, retrieving attributes from it and then executing upon those attributes.
  • share_user#approved=(value) raises a nasty error that then has to be handled specially everywhere it is likely to be invoked.
  • share_user#approved=(value) heavily complicates creating new share_users as it requires the share_user to already have a share at point of creation. This requires telling the relationship to retrieve it from memory as opposed to the database. More on this can be found here.
All in all, not a great solution, it's messy, doesn't quite do the job and requires modifying a lot of other code.

Attempt Two - validates_associated + validates_with
  1. Create a new validation for the Share called ApprovedShareUserCount and invoke it with validates_with.
  2.  class ApprovedShareUserCountValidator < ActiveModel::Validator  
          def validate(record)  
               #Required to ensure that the presence validation spec for user_limit doesn't trigger this validation to throw a nil error.
               return if record.user_limit.nil?  
    
               count = record.share_users.where("approved = true").count  
               if count > record.user_limit  
                    record.errors.add(:share_users, 'The share has reached its user limit, please contact your CEO or Tech Admin about this issue.')  
               end  
          end  
     end  
    
    Note: I stored my validators in a new folder; app/validators and added a custom eager loading path for it.
     class Share < ActiveRecord::Base       
          has_many :share_users, foreign_key: "share_id", dependent: :destroy  
          validates :user_limit, presence: true  
          validates_with ApprovedShareUserCountValidator  
     end  
    

  3. Invoke this new validation whenever a share_user is saved by adding 'validates_associated :share' to it.
  4.  class ShareUser < ActiveRecord::Base  
          belongs_to :share  
          validates :share_id, presence: true  
          validates_associated :share  
     end  
    
Supporting Share Spec
 describe "should ensure the user_limit is never exceeded" do  
                let!(:user_limit_share) {FactoryGirl.create(:share, user_limit: 1)}  
                let!(:approved_share_user) {FactoryGirl.create(:share_user, approved: true, share_id: user_limit_share.id)}  
                let!(:disapproved_share_user) {FactoryGirl.create(:share_user, approved: false, share_id: user_limit_share.id)}  
                subject{user_limit_share}  
                it "should be valid if the number of approved share_users is <= the share's user_limit" do  
                     should be_valid  
                end  
           end  

Supporting Share_User spec
 describe "validate_association :share" do  
      let!(:user_limit_share) {FactoryGirl.create(:share, user_limit: 1)}  
      let!(:approved_share_user) {FactoryGirl.create(:share_user, approved: true, share_id: user_limit_share.id)}  
      let!(:disapproved_share_user) {FactoryGirl.create(:share_user, approved: false, share_id: user_limit_share.id)}  
      it "should not be valid if approving it would exceed the share's user_limit" do  
           disapproved_share_user.approved = true  
           expect(disapproved_share_user.valid?).to_not be_true  
      end  
 end  

The allure of this is that it ties into the existing validation framework. Unfortunately, there is one critical problem; it doesn't work. The validations all trigger and work perfectly fine! But the share's validation is triggered against the database, and does NOT use the share_user in memory that triggered it.

So to use this solution, the share_user would have to be saved to the database, thus defeating the entire purpose.

Attempt Three - Share#respect_share?(share_user) + ShareUser#validates_with custom validator
Attempt two was definitely far more elegant then attempt one and was certainly moving in the right direction. So how do we build on that with something that respects encapsulation, can be easily expanded on and that uses the existing validation framework?


  1. Create a public method on Share that determines if a given share_user would violate the share's rules.
  2.  def respect_share?(share_user)  
               #Ensure the share's user_limit is respected.  
               if share_user.approved == true  
                    count = self.share_users.where("approved = true").count  
                    #This assumes that in the event this returns true, the share_user will be added to the share.  
                    #Thus it must count up, so that a share with 10 users and a 10 user limit will return false.  
                    if count+1<=self.user_limit  
                         return true  
                    else  
                         return false  
                    end  
               end  
               #This method can be expanded as the share grows.  
          end  
    
    At this point we only care about respecting the user_limit. However as you can see this can be easily expanded by breaking out the if statement (and any additional if statements) into supporting private methods.
  3. Create a custom validator for share_user that invokes the owning share's respect_share? method.
  4.  class RespectShareValidator < ActiveModel::Validator  
          def validate(record)  
               #Required to ensure that the presence validation spec for share_id doesn't trigger this validation to throw a nil error.  
               return if record.share_id.nil?  
               #This simply calls the record's owning share and then passes the record to it.  
               if record.share.respect_share?(record) == false  
                    record.errors.add(:share_users, '<Error Message Here>')  
               end  
          end  
     end  
    

Supporting Share Spec
 describe "Public Methods > " do  
      describe "respect_share?(share_user)" do  
           let!(:user_limit_share) {FactoryGirl.create(:share, user_limit: 1)}  
           let!(:approved_share_user) {FactoryGirl.create(:share_user, approved: true, share_id: user_limit_share.id)}  
           let!(:disapproved_share_user) {FactoryGirl.create(:share_user, approved: false, share_id: user_limit_share.id)}  
           it "should return false if the share's user_limit would be exceeded by saving the passed in share_user" do  
                disapproved_share_user.approved = true  
                expect(user_limit_share.respect_share?(disapproved_share_user)).to be false  
           end  
      end  
 end  

Supporting Share_User Spec
 describe "RespectShareValidator" do  
      let!(:user_limit_share) {FactoryGirl.create(:share, user_limit: 1)}  
      let!(:approved_share_user) {FactoryGirl.create(:share_user, approved: true, share_id: user_limit_share.id)}  
      let!(:disapproved_share_user) {FactoryGirl.create(:share_user, approved: false, share_id: user_limit_share.id)}  
      it "should not be valid if approving it would exceed the share's user_limit" do  
           disapproved_share_user.approved = true  
           expect(disapproved_share_user.valid?).to_not be_true  
      end  
 end  

Unlike Attempt Two this one actually works. This is because the current in-memory representation of the share_user is passed to Share#respect_share? and the method knows that it is dealing with an in memory model instead of a database record. Now while this knowledge isn't ideal, it is much cleaner and functional then either of the other two attempts.

The existing validation mechanisms are used, and there are no custom errors to be caught. This means that the majority of existing code will function without issue. On top of this, because all of the code necessary to determine if a Share would be broken is contained within the Share class itself, this solution doesn't violate encapsulation.


Sunday, April 20, 2014

The Application of Algorithms

Since January I've been taking an Algorithms class from Princeton. Through the marvel that is modern education and the internet, it is possible to learn such a complex subject from such a prestigious university for free. The pusher of this most elucidating drug is Coursera, and the course work is delightfully complicated; especially since it required dusting off my Java foo which had been withering away for the better part of two years now.

Unfortunately, there is one terrible catch;

"I will not make solutions to homework, quizzes or exams available to anyone else. This includes both solutions written by me, as well as any official solutions provided by the course staff. " - Coursera's Honor Code (EULA) [Emphasis Mine]

As you might imagine, this complicates the issue of sharing one's recent work. It is also why this post isn't full of GitHub links to complicated implementations I'm proud of.

Part One of this Algorithms class covers a fairly basic set of algorithms;

  • The foundations of algorithms
    • Speed, memory, order of growth, practical observations and jargon.
  • Union Find
    • A basic connection finding algorithm, not terribly powerful on its own, but a fundamental building block in algorithm theory and a sub routine in many more powerful algorithms.
  • Basic Sort Algorithms
    • A quick summation of the basic sorting algorithms: selection sort, insertion sort and shell sort.
  • Stacks & Queues
    • A more in depth look at the algorithms behind these basic data structures, this section looks at some fundamental OOP concepts with a Java flair.
  • Advanced Sorts
    • Quicksort, Mergesort and their various implementations. This section is really where the class gets good and starkly illustrates the strengths of good algorithms and the trade offs necessary to make them.
  • P-Queues and Symbol Tables
    • The opener to BSTs and Trees as a whole. As with all things in this class, the fundamental data structures beneath a common concept are explained in detail before the main data structure. In a lot of ways it feels like an opening act before the people you paid to see show up.
  • Advanced Trees
    • Balanced search trees and geometric applications of BSTs, specifically 1d,  Kd-trees and line interception, detection, and sort algorithms. This lesson is a real mind bender as it hints at the fundamental basis for machine vision technology.
  • Hash Tables 
    • The last week is spent covering hash tables, since they're really the ultimate form of Symbol Tables. Time is devoted to the core elements of hash table creation, the actual hashing and subsequent sorting and searching. This lesson actually prompted me to go looking through MRI Ruby's implementation of hashes.

While the lectures and assignments have prompted me to think in new ways, this part of the course left me feeling like the knowledge was somewhat difficult to apply. The abstractions discussed above, save maybe the sorting algorithms, are quite far removed from real world problems. The class' assignments were quite good at relating these algorithms back to real world problems, but after turning each one in, I found myself pondering how I could sneak the algorithm into a rails project in some nifty manner.

While Rails is used for many things, the ease with which one can build a simple web application is without a doubt its main strength. Unfortunately, there just aren't many problems in the web-app sphere that lend themselves to these basic algorithms. Sure sorting of data is important, but your database is far better at it then you. There just aren't many nifty ways to use the algorithms discussed above in most Rails apps, the algorithms are too low level. They deal with memory optimization, connectivity and data management, things that are all abstracted away from you by either Ruby, Rails, your database or the inherent nature of the MVC framework. In this situation, the strength of Rails is its greatest weakness.

The first part of the algorithms class ended in the middle of March, but that isn't where this story ends. There are more delightful algorithms to explore, and part two began a week later covering some of the more complicated, and applicable, algorithms used today.


  • Undirected and Directed Graphs
    • Graphs are one of my favorite abstractions. A graph is nothing more then a point of vertices and lines connecting them. The beauty of this structure is that it is flexible enough to represent anything from roads, to social networks, to points of failure in a circuit board. This first lesson focuses on building, searching, sorting and generally using graphs.
  • Minimum Spanning Trees & Shortest Path
    • This lesson guides the student into edge weighted graphs, and the big name algorithms, Kruskal and Prim, that dominate them. Another problem type is introduced, the 'Shortest Path Problem'. The core of problems is that they're a transferable abstraction that can be used to solve numerous different real world problems. The lesson wraps up with distinguishing cyclic vs a-cyclic graphs, Dijkstra's algorithm and the double edged sword of negative weighted edges. All in all this lesson series was my favorite.
  • Max Flow & Radix Sorts
    • Maximum Flow is another problem type, specifically it is the question of how much of something can transit a network from point A to B. As with all problem types, it transfers seamlessly from rail networks to phone lines to the internet itself. The answers to maxflow problems lie in the Ford-Fulkerson algorithm, which as you might've guessed, sits upon a graph. Radix Sorts are a series of different sorts used to manage Strings. Which despite their fundamental nature in all things programming, are quite a complicated construct to work with.
What comes next, I won't know for a little while, but this second part of the class has thus far proven itself to be much more exciting. Unlike the first course, the algorithms discussed above do not deal primarily with memory management or other low level concerns. Instead they leverage a mastery of these low level concepts to solve real world problems. This makes them prime candidates for being integrated into a Rails application. Yet, where does one go to find a problem that needs solving? Perhaps a mock travel agency with route planning using Dijkstra's algorithm? Maybe a game of some kind that uses Ford-Fulkerson to determine the maximum possible score?

Beyond the question of how to display my new found knowledge, there is a subtler subject to discuss; speed and clarity of reasoning. Many times a programmer will find themselves pondering whether they've determined all possible edge cases. Or whether an edge case exists in the first place. This process can consume hours, sometimes days, depending on how it is gone about. An uncaught edge case can lead to a subtle bug that only ever materializes once or twice in a half million queries. Working with these algorithms has improved my ability to see the possibilities of a given situation. That is the greatest boon of these classes. Sure the knowledge is nice, but being able to look at a problem and quickly make heads or tails of it, determine a solution and then work through the finer points of that solution is the real reward.

I highly recommend anyone who hasn't taken such a class to seek one out. The skills you'll develop are well worth the time.