Wednesday, June 19, 2013

The knapsack problem and Budgetting.

I created a prototype for my budgeteer program in my last post, and now I can show you the next stage of the application.

Because the other one was implemented in PHP, html, and javascript it suffered from the fact that PHP scripts are meant to die. While I didn't have any trouble with the script timing out, it seems like a better idea to not have to worry about that at all if I don't have to. Therefore, I moved the application to it's new home in a recent project of mine.

In order to figure out if you can afford your budget there are a few thing's to take into account.

Are your fixed expenses greater than your income?

If your rent cost more than your paychecks for the month combined, you have a problem. Hate to tell you, but you simply can't afford wherever you live. If this is the case, then there's no need to attempt the calculation that would figure out what you're able to purchase from your wish list.

Are your expenses covered?

If you're going out to eat every night with friends and spending twenty dollars each time, unless you have a very nice job and a very affordable living situation, it may be a good idea to stop. In fact, if you can't afford your expenses, it makes no sense for you to even try to determine if you can afford anything on your wish list. It makes more sense to keep your wish list, a wishlist.

Finally, what from your wish list can you afford?

If you've managed to get to the point where all your expenses are covered and you can start eyeing your wishlist for possible treasures, then, and only then is your wishlist going to come into the budget calculation.

How do you do it?

So how do you determine if you can afford all these things? Well, it's a pretty well known problem in computer science called the 0-1 knapsack problem (as long as we're not looking into the date's on the budget, as that will elevate it to a job scheduling problem in addition to the knapsack problem -- which is in the works and might be the subject of my next post) that is actually an NP-complete problem. Or at least, part of it is. Luckily for us, there exist approximation algorithms as well as dynamic programming solutions to the problem.

The problem definition is this: I give you a bag that can hold a certain amount of weight, and a list of items that are heavy and have some value. Your job is to get as much value into the bag as possible without going over the weight limit. If you're curious about the solutions to the problem you can check out the Wikipedia page and read all about them. I found that Mikes Coderama was the most useful source as it was fairly easy to use his code as reference and the Wikipedia pseudo code as a guide as I coded my solution.

Now you might ask, how do you translate my budget into a knapsack problem? We simply define the following mappings:

  • Weight Limit becomes our Total Income
  • Item Value becomes the Price of any expense
  • Item Weight is also the Price of any expense
It's perfectly acceptable to have two properties be the same, if you think about it, it makes perfect sense.  If we wanted to get really fancy, then we could allow a user to order the items in their wish lists or expenses by how much they wanted them and use that as a sort of value multiplier, but to be honest, if you're turning towards budgeting software to help out with your finances, you probably shouldn't have too much of an impact on how valuable you find that extra large big mac over paying your rent. 

Let's talk high level strategy.

Our application breaks up a user's finances into three categories. Checkbook, Bills, and Wishlist. The Checkbook is the only source of Income as well as a source of expenses. Bills are expenses which we absolutely must pay no matter what. The wishlist is self explanatory. 

So first off, we deduct the bills from the income. I thought we were using an algorithm? Don't you worry, we're getting there. But if a user can't afford their bills, then they really shouldn't be spending anything else now should they? Plus, bills tend to be rather high in price, and since our knapsack problem is trying to maximize value of our bag, it's more likely that we'll drop one of our bills in favor of smaller expenses if we have to. And we can't have that since we' need to pay our bills. 

This new quantity of income - bills (let's call the resultant lambda), is our weight limit on our first knapsack problem. First? Yes, first, we're going to solve it twice. Once for our expenses, and once for our wishlist. Why you ask? Because if we can't afford our expenses, then we most certainly are not going to look into purchasing anything from our wish list. Agreed? 

So lambda becomes the weight limit on our knapsack, and we then feed all of the expenses from our checkbook into the algorithm and wait for it to churn out information. Luckily for us, the algorithm will run in O(nW) time with O(W) space where n is the number of items and W is the maximum weight we'll allow in our bag. This is pretty good for an NP problem, and plus, the number of items in our budget will hopefully be rather small considering it's a monthly budget. (And we're assuming we don't go shopping every other day) 

After we've solved the first knapsack problem we can collect the items we included in our bag, and figure out how much money is left over from lambda. Let's call this left over money lambda prime (lambda`) and say it's equal to lambda - (the sum of the values of the items in the previous bag). 

Now, we'll construct our second knapsack problem with lambda` as our weight limit and the items being the wishlist items only if we managed to use all the items in our bag. If so, we'll perform the same calculation again and have plenty of information to send back to the user. 

It works like a charm.

No comments:

Post a Comment