Wednesday, June 19, 2013

The knapsack problem and Budgetting.

I created a prototype for my budgeteer program in my last post, and now I can show you the next stage of the application.

Because the other one was implemented in PHP, html, and javascript it suffered from the fact that PHP scripts are meant to die. While I didn't have any trouble with the script timing out, it seems like a better idea to not have to worry about that at all if I don't have to. Therefore, I moved the application to it's new home in a recent project of mine.

In order to figure out if you can afford your budget there are a few thing's to take into account.

Are your fixed expenses greater than your income?

If your rent cost more than your paychecks for the month combined, you have a problem. Hate to tell you, but you simply can't afford wherever you live. If this is the case, then there's no need to attempt the calculation that would figure out what you're able to purchase from your wish list.

Are your expenses covered?

If you're going out to eat every night with friends and spending twenty dollars each time, unless you have a very nice job and a very affordable living situation, it may be a good idea to stop. In fact, if you can't afford your expenses, it makes no sense for you to even try to determine if you can afford anything on your wish list. It makes more sense to keep your wish list, a wishlist.

Finally, what from your wish list can you afford?

If you've managed to get to the point where all your expenses are covered and you can start eyeing your wishlist for possible treasures, then, and only then is your wishlist going to come into the budget calculation.

How do you do it?

So how do you determine if you can afford all these things? Well, it's a pretty well known problem in computer science called the 0-1 knapsack problem (as long as we're not looking into the date's on the budget, as that will elevate it to a job scheduling problem in addition to the knapsack problem -- which is in the works and might be the subject of my next post) that is actually an NP-complete problem. Or at least, part of it is. Luckily for us, there exist approximation algorithms as well as dynamic programming solutions to the problem.

The problem definition is this: I give you a bag that can hold a certain amount of weight, and a list of items that are heavy and have some value. Your job is to get as much value into the bag as possible without going over the weight limit. If you're curious about the solutions to the problem you can check out the Wikipedia page and read all about them. I found that Mikes Coderama was the most useful source as it was fairly easy to use his code as reference and the Wikipedia pseudo code as a guide as I coded my solution.

Now you might ask, how do you translate my budget into a knapsack problem? We simply define the following mappings:

  • Weight Limit becomes our Total Income
  • Item Value becomes the Price of any expense
  • Item Weight is also the Price of any expense
It's perfectly acceptable to have two properties be the same, if you think about it, it makes perfect sense.  If we wanted to get really fancy, then we could allow a user to order the items in their wish lists or expenses by how much they wanted them and use that as a sort of value multiplier, but to be honest, if you're turning towards budgeting software to help out with your finances, you probably shouldn't have too much of an impact on how valuable you find that extra large big mac over paying your rent. 

Let's talk high level strategy.

Our application breaks up a user's finances into three categories. Checkbook, Bills, and Wishlist. The Checkbook is the only source of Income as well as a source of expenses. Bills are expenses which we absolutely must pay no matter what. The wishlist is self explanatory. 

So first off, we deduct the bills from the income. I thought we were using an algorithm? Don't you worry, we're getting there. But if a user can't afford their bills, then they really shouldn't be spending anything else now should they? Plus, bills tend to be rather high in price, and since our knapsack problem is trying to maximize value of our bag, it's more likely that we'll drop one of our bills in favor of smaller expenses if we have to. And we can't have that since we' need to pay our bills. 

This new quantity of income - bills (let's call the resultant lambda), is our weight limit on our first knapsack problem. First? Yes, first, we're going to solve it twice. Once for our expenses, and once for our wishlist. Why you ask? Because if we can't afford our expenses, then we most certainly are not going to look into purchasing anything from our wish list. Agreed? 

So lambda becomes the weight limit on our knapsack, and we then feed all of the expenses from our checkbook into the algorithm and wait for it to churn out information. Luckily for us, the algorithm will run in O(nW) time with O(W) space where n is the number of items and W is the maximum weight we'll allow in our bag. This is pretty good for an NP problem, and plus, the number of items in our budget will hopefully be rather small considering it's a monthly budget. (And we're assuming we don't go shopping every other day) 

After we've solved the first knapsack problem we can collect the items we included in our bag, and figure out how much money is left over from lambda. Let's call this left over money lambda prime (lambda`) and say it's equal to lambda - (the sum of the values of the items in the previous bag). 

Now, we'll construct our second knapsack problem with lambda` as our weight limit and the items being the wishlist items only if we managed to use all the items in our bag. If so, we'll perform the same calculation again and have plenty of information to send back to the user. 

It works like a charm.

Sunday, June 9, 2013

Budgeteer Prototype

So I've decided to call my little project Budgeteer (Like a muskateer, yet different!) And I've only been working on it the past few days while I sit on the bus back and forth from work, but it's coming along nicely. I have a rather simple front end with no persistence between page refreshes, and a backend that performs a very simple calculation and sends it's response back. But I do like it. I've also just started playing with css3 media queries in order to detect phones. So I've used a little bit of those to make the application appear nicely on my mobile emulator. I'm using Ripple to do my emulation, and so far I'm enjoying it's injection of tiny hippos (I'm not kidding, check out the javascript log when using the extension.) It's really nice.

Anyway,without further adieu here's the results from this week:

What it looks like in a browser

What it looks like on a phone (emulated by Ripple)

The code itself hasn't made it's way to github yet, and is instead sitting on a private repository on my bitbucket account. Also, you may recognize the color scheme from my previous post about javascript caching, that's because this application is using my little library to cache results so it doesn't need to query for the same thing more than once. Of course, when you submit a request to have your budget analyzed, it's not going to prevent you because you've done the action already, but once I implement the sql side of things (which I probably will soon), I plan on having the budget for a session retrieved from the server with the cache using localstorage so it doesn't have to be done more than once. Of course I do need to checkout if local storage is preserved between page refreshes or not.

There's still a lot of variables to be done before this small application is done and doing something really cool, but for not it's just a neat little application with room for improvement. I plan on adding some more budget buddy-esk features in (dates and whatnot) and then plan on having the tool use that type of information to advise you

Thursday, June 6, 2013

Caching with javascript and HTML5 local storage


As I said in my last post, I've started working on another finance tool. While I haven't made much progress on the actual implementation of accounts and all that, I have started creating a core system to speed up the system for efficiency.

I am creating my own Javascript library to cache the results of a request to certain URLs. What happens at the core of this, is thus:

If we haven not yet cached the url we submit our AJAX request to the server, if we have cached the result, we return that instead. Sounds like a pretty simple concept to me. And it turns out, with a little bit of work it's not too hard to implement.

http://pastebin.com/drrqi8Pw

Instead of crudely pasting the code into this post, I've included it in the paste above. while this code will not run if you just plug it into a file (local dependencies on the library I'm making), you can get the general idea of how it works.

A use case can go something like this:
The user calls some function out of the library (such as getBudget) which, after making sure parameters are correctly, sends the url and a requestHandler to the apiRequest function. The url is parsed into just the important parameters and checked against the cache. If it's in the cache that object will be passed as an argument to the users requestHandler, if not, then we'll make the ajax request. Assuming we've made the ajax request, during the respond we'll pass the response JSON to the user's function first, and then we'll perform whatever operations we need in order to store it locally and mark the url we used to get it as cache-able.

You'll notice that I've included the apiSendDate in the paste as well, this is because we need to map CRUD operations (minus the R) to URLs that retrieve that data. This is abstracted away into the apiURLMap object, which (if the finance tool becomes complicated --which it will) will eventually become a function in order to map a single URL to multiple url's that would correspond to stale information.

To prove to you that this works (since I provided non-functioning code above) here's a couple screenshots of my mock prototype --please keep in mind front end is not important to me until the functionality is there--

Network traffic for first submission

S



Submitting a request by clicking the Get Items link



Submitting a second request, uses the cache





Second shot of the network requests, only the first one is there, so we've saved ourselves the trip to the server


Note: Just to let everyone know, localStorage can only store strings by default unless you override the eprototypes, that's why we stringify the data before placing it into the local storage

Sunday, June 2, 2013

Finance applications

For whatever reason, I find myself constantly thinking of making software to assist with budgeting and managing your personal finances. For some reason I feel a strong pull in that direction. Now that I've got a bit of Data Mining under my belt after studying with Xindong Wu I think it's time I applied it. Association analysis is a beautiful thing.

Also, when I first created budget buddy, it was more of an online checkbook. I've started a new project that I plan on making a far more in depth tool. I'm envisioning a php front end with the heavy lifting done by C if necessary  Why C and not something like ruby or java? Well, it might just be my general desire to create from scratch, but I really love the feel of working with C or C++. Hell, I'd love to play with Assembly if possible with this project, but that probably won't happen. Maybe some OCaml though or ML. I could use the Google App Engine and Python to create an application, but for whatever reason I've been digging on Javascript recently, and I don't mean jQuery, I mean pure Javascript, in fact I've made a couple tutorials to keep myself busy and as an exercise in explanations.

As I create the application I plan on posting here a bit with details, and as always the code will probably make it's way up onto github. More to come!