Andrew Glover

President, Stelligent Incorporated

Can you dig it man?
This is syndicated content from http://thediscoblog.com
 
[Tue, 07 Sep 2010 19:53:25 +0000]

Twitter turned off basic authentication. Finally. Henceforth, you can’t log into Twitter via its API in the traditional sense; accordingly, the following code, which uses Twitter4J, isn’t valid any longer:

Twitter twitter =
    new TwitterFactory().getInstance("some user", "some password");

If you try that these days, you should see a nasty JSON response along the lines of {"code":53, "message":"Basic authentication is not supported"}.

Instead, Twitter now requires OAuth for authenticating requests, which isn’t so bad; however, if you intend to use Twitter4J, things get complicated quickly, mainly because the examples currently listed don’t actually work. Because it’s my bag, I aim to set the record straight.

OAuth isn’t terribly complicated; nevertheless, if you read the various documents related to it, you’ll most likely end up confused. There are various forms of OAuth, and the details differ depending on whether you are building a desktop, mobile, or web application. In short, however, OAuth basically means two things:

  • applications don’t need to store a login and password
  • applications delegate authorization to a trusted location, i.e. Twitter

It’s all done with various tokens that are traded. Accordingly, the first step to get working with OAuth for web applications is to register your application with Twitter. You’ll need to provide a few pieces of information; the key one is a callback URL, which can be overridden at runtime.

You’ll be given a few datums in return: namely a consumer key and a consumer secret. You’ll need those to get things started. Obviously, don’t share the secret.

Next, you’ll do two things — ask the user to sign into Twitter (in this step you’ll send some information to Twitter) and then when the user grants you permission, Twitter will invoke your callback URL. At this endpoint, you’ll need to grab another token. Then you’ll have the credentials to act upon a user’s behalf going forward.

I’m going to demonstrate this with a simple web application written with The Play Framework, which is a nifty Java-based framework similar in ways to something like Rails or Grails. One thing I particularly like about Play is its ability to get a web application up and running without a domain model. This is distinctly different from Grails, which is definitely a fancy, rapid web application development framework, but which stresses a model first. Play seems to stress controllers upfront with less emphasis on domains. Thus, with Play, I can quickly demonstrate a two-step OAuth flow without having to worry about a model.

As Play is a Java framework (it does leverage Groovy under the covers in some places), you end up writing everything in Java. Endpoints are public static void methods in classes that extend Play’s Controller type. Thus, my first endpoint is dubbed login, which is invoked after a user clicks a link asking them to log into Twitter:

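public static void login() {
  // Consumer key and consumer secret issued at registration (elided here)
  Twitter twitter =
      new TwitterFactory().getOAuthAuthorizedInstance("r4...w", "j4...2");

  try {
    // Ask Twitter for a request token, naming the URL it should call back
    RequestToken requestToken = twitter.getOAuthRequestToken(
        "http://localhost:9000/application/callback");

    // Keep both halves of the request token for the callback step
    session.put("requestToken_token", requestToken.getToken());
    session.put("requestToken_secret", requestToken.getTokenSecret());
    // Send the browser to Twitter so the user can grant access
    redirect(requestToken.getAuthorizationURL());
  } catch (Exception e) {
    e.printStackTrace();
  }
}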

As you can see above, a Twitter instance is obtained from a TwitterFactory using my consumer key and secret. Then, a RequestToken instance is obtained and, in doing so, I pass in my own callback URL (http://localhost:9000/application/callback), which Twitter will invoke after a person grants access. Next, two pieces of information are placed into Play’s session object, which isn’t a typical Servlet Session, but really a cookie. Those two pieces of information will be required when control is transferred back to your web application. Lastly, the browser is redirected to an authorization URL on Twitter’s website.

The callback URL invokes the following endpoint:

public static void callback(String oauth_token, String oauth_verifier) {
  // Consumer key and consumer secret again (elided here)
  Twitter twitter =
      new TwitterFactory().getOAuthAuthorizedInstance("90...2", "3ee..");
  AccessToken accTok = null;
  try {
    // Trade the stored request token plus the verifier for an access token
    accTok = twitter.getOAuthAccessToken(
        session.get("requestToken_token"),
        session.get("requestToken_secret"),
        oauth_verifier);
  } catch (Exception e) {
    e.printStackTrace();
  }
  //... do twitter stuff....
}

Play endpoints can have parameters, to which incoming HTTP parameters are bound by name. As you can see, Twitter passes back two parameters: oauth_token and oauth_verifier. I only need one of them, the oauth_verifier, which is used in concert with the two request token values held in the cookie to obtain an AccessToken instance.
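Play does this binding for you, but it can help to see what is being bound. The following is a minimal sketch, using only the JDK, of pulling oauth_token and oauth_verifier out of a callback URL by name. The class name CallbackParams and the token values are made up for illustration; this is not how Play itself implements binding.

```java
import java.net.URI;
import java.net.URLDecoder;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;

class CallbackParams {

    // Splits the query string of a callback URL into decoded name/value pairs.
    static Map<String, String> parse(String callbackUrl) {
        Map<String, String> params = new LinkedHashMap<>();
        String query = URI.create(callbackUrl).getRawQuery();
        if (query == null) {
            return params; // no query string at all
        }
        for (String pair : query.split("&")) {
            int eq = pair.indexOf('=');
            if (eq < 0) {
                continue; // skip malformed pairs
            }
            String name = URLDecoder.decode(pair.substring(0, eq), StandardCharsets.UTF_8);
            String value = URLDecoder.decode(pair.substring(eq + 1), StandardCharsets.UTF_8);
            params.put(name, value);
        }
        return params;
    }

    public static void main(String[] args) {
        // Made-up token values, for illustration only
        Map<String, String> params = parse(
            "http://localhost:9000/application/callback?oauth_token=abc&oauth_verifier=xyz");
        System.out.println(params.get("oauth_verifier"));
    }
}
```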

For the remainder of this session, my twitter instance is authorized: I can do things on behalf of the user who granted my application access to their account (such as update status, etc). If I choose to do things on their behalf in the future, I can reuse the access token. All I need to do is save the AccessToken’s token and secret for this user; the oauth_verifier itself is only good for the one exchange shown above.
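Saving the credentials for later could look something like the sketch below. It assumes a hypothetical in-memory TokenStore keyed by a user id of your choosing; the two strings it holds correspond to what Twitter4J’s AccessToken returns from getToken() and getTokenSecret(). A real application would persist them somewhere durable and protected.

```java
import java.util.Map;
import java.util.Optional;
import java.util.concurrent.ConcurrentHashMap;

class TokenStore {

    // Hypothetical value holder for the two strings that make up an access token.
    static final class StoredToken {
        final String token;
        final String tokenSecret;

        StoredToken(String token, String tokenSecret) {
            this.token = token;
            this.tokenSecret = tokenSecret;
        }
    }

    // In-memory only; contents are lost when the JVM exits.
    private final Map<String, StoredToken> byUser = new ConcurrentHashMap<>();

    // Remember the access token for a user after the callback step succeeds.
    void save(String userId, String token, String tokenSecret) {
        byUser.put(userId, new StoredToken(token, tokenSecret));
    }

    // Look the token up again on a later visit; empty if the user never authorized.
    Optional<StoredToken> find(String userId) {
        return Optional.ofNullable(byUser.get(userId));
    }

    public static void main(String[] args) {
        TokenStore store = new TokenStore();
        store.save("some user", "made-up-token", "made-up-secret");
        System.out.println(store.find("some user").isPresent());
    }
}
```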

Now that you’re familiar with OAuth and Twitter4J’s APIs, go forth and build Twitter applications, baby. Can you dig it?

Looking to spin up Continuous Integration quickly? Check out www.ciinabox.com.

 
[Mon, 06 Sep 2010 13:15:23 +0000]

I recently caught up with Tim Berglund and had a hip conversation with him regarding open source business intelligence. Tim points out that business intelligence tools have traditionally been a high-cost part of any enterprise’s software inventory (involving lots of golf and armies of consultants); however, options have emerged that allow teams to build credible business intelligence stacks out of entirely open-source components. In this podcast, Tim talks about various tools for ETL, reporting, and analytics like Pentaho and Talend — I really enjoyed our conversation as I definitely learned a few things!

 
[Fri, 03 Sep 2010 12:28:41 +0000]

As I’ve pointed out before, sharding isn’t for everyone, but it’s one way that relational systems can meet the demands of huge data. For some shops, sharding means being able to keep a trusted database like MySQL in place without sacrificing data scalability or system performance. In this installment of the Java development 2.0 series, dubbed “Sharding with Hibernate Shards,” find out when sharding works and when it doesn’t, and then get your hands busy sharding a simple Hibernate and Spring application capable of handling terabytes of data.

 
[Wed, 01 Sep 2010 19:51:10 +0000]

Both MongoDB and CouchDB are document-oriented datastores. They both work with JSON documents. They both are usually thrown into the NoSQL bucket. They’re both hip. But that’s where the similarities, for the most part, stop.

When it comes to queries, the two couldn’t be more different. CouchDB requires pre-defined views (which are essentially JavaScript MapReduce functions) and MongoDB supports dynamic queries (basically what you’re used to with normal RDBMS ad-hoc SQL queries). What’s more, CouchDB’s API is RESTful, while MongoDB’s API is more native; that is, you essentially issue a query using a driver in the language of your choice.

For example, with CouchDB, in order to insert some data, I can use a tool like Groovy’s RESTClient and issue a RESTful PUT like so:

import static groovyx.net.http.ContentType.JSON
import groovyx.net.http.RESTClient

def client = new RESTClient("http://localhost:5498/")
response = client.put(path: "parking_tickets/1234334325",
  contentType: JSON,
  requestContentType:  JSON,
  body: [officer: "Robert Grey",
         location: "199 Castle Dr",
         vehicle_plate: "New York 77777",
         offense: "Parked in no parking zone",
         date: "2010/07/31"])

Note, in this case, I have to specify an ID for this parking ticket (1234334325). I can, incidentally, ask CouchDB for a UUID too by issuing an HTTP GET to the /_uuids path.

If I wish to find all tickets issued by Officer Grey, for example, I must define a view. Views are simply URLs that execute JavaScript MapReduce functions. Accordingly, I can quickly code a function to grab any document whose officer property is “Robert Grey” like so:

function(doc) {
  if(doc.officer == "Robert Grey"){
    emit(null, doc);
  }
}

I have to give this view a name; consequently, when I issue an HTTP GET request to that view’s name, I can expect at least one document:

response = client.get(path: "parking_tickets/_view/by_name/officer_grey",
        contentType: JSON, requestContentType: JSON)

assert response.data.total_rows == 1
response.data.rows.each{
   assert it.value.officer == "Robert Grey"
}

In summary, with CouchDB, I can’t quickly issue an ad-hoc RESTful call to obtain some bit of information; I must first define a query (aka view) and then expose it to the outside world. In contrast, MongoDB works much like you’ve been used to with normal databases: you can query for whatever your heart desires at runtime.
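To make the distinction concrete without either database, here is a rough in-memory analogy in plain Java. It is only a sketch: the class ViewVersusQuery, its method names, and the second ticket are made up for contrast, and neither CouchDB nor MongoDB works this way internally. The point is simply that the view’s filter is fixed when it is defined, while the ad-hoc query takes its field and value at call time.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Predicate;
import java.util.stream.Collectors;

class ViewVersusQuery {

    // Builds a tiny stand-in for a JSON document.
    static Map<String, String> ticket(String officer, String location) {
        Map<String, String> doc = new LinkedHashMap<>();
        doc.put("officer", officer);
        doc.put("location", location);
        return doc;
    }

    // "View": the condition is baked in when the view is defined.
    static final Predicate<Map<String, String>> OFFICER_GREY_VIEW =
        doc -> "Robert Grey".equals(doc.get("officer"));

    static List<Map<String, String>> runView(List<Map<String, String>> docs) {
        return docs.stream().filter(OFFICER_GREY_VIEW).collect(Collectors.toList());
    }

    // "Ad-hoc query": the caller picks the field and value at runtime.
    static List<Map<String, String>> query(
            List<Map<String, String>> docs, String field, String value) {
        return docs.stream()
            .filter(doc -> value.equals(doc.get(field)))
            .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<Map<String, String>> docs = Arrays.asList(
            ticket("Robert Grey", "199 Castle Dr"),
            ticket("Jane Doe", "12 Main St")); // made-up second ticket
        System.out.println(runView(docs).size());
        System.out.println(query(docs, "location", "12 Main St").size());
    }
}
```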

For example, I can add the same instance of a parking ticket using MongoDB’s native Java driver (there are better options for working with MongoDB, by the way) like so:

DBCollection coll = db.getCollection("parking_tickets");
BasicDBObject doc = new BasicDBObject();

doc.put("officer", "Robert Grey");
doc.put("location", "199 Castle Dr");
doc.put("vehicle_plate", "New York 77777");
//...
coll.insert(doc);

I can subsequently find any ticket issued by Officer Robert Grey by simply issuing a query on the officer property like so:

BasicDBObject query = new BasicDBObject();
query.put("officer", "Robert Grey");
DBCursor cur = coll.find(query);
while (cur.hasNext()) {
  System.out.println(cur.next());
}

Thus, while both document-oriented datastores have some similarities, when it comes to querying, they are vastly different. CouchDB requires the usage of MapReduce while MongoDB allows for more dynamically oriented queries (MongoDB also supports MapReduce). Can you dig it?

 
[Tue, 24 Aug 2010 14:48:44 +0000]

Cédric Beust has an interesting blog post entitled “Clojure, concurrency and silver bullets” where he takes issue with the notion that Clojure can yield code that “is multithread safe and it will automatically scale.”

Cédric goes on to state that the concurrency problem doesn’t need a new language, as “hundreds of thousands of lines written in C, C++, C#, Java and who knows what other non functional programming languages are running concurrently, and they are doing just fine.”

In fact, Cédric is quick to point out that Java has already added libraries (in the form of java.util.concurrent) that make concurrent coding easier, and I don’t disagree with him. What’s more, he goes on to point out that Actors aren’t the be-all and end-all of concurrent programming; he even links to a post on Stephan Schmidt’s blog entitled “Actor Myths,” whose fruitful discussion is worthy of a close read.

I tend to agree with both Cédric and Stephan — there are no silver bullets which will kill the concurrency werewolf. Yet, I’d like to point out a few things regarding concurrency and specifically actors that might shed some light on why people are espousing something like Clojure and why the Actor Model has gained some mind share.

First, as Herb Sutter noted in his article entitled “The Free Lunch is Over: A Fundamental Turn Toward Concurrency in Software” obtaining an appreciable speed up in application performance requires taking advantage of multi-core chip architectures, which for myriad applications running today isn’t happening. That is, when these applications were written, concurrency wasn’t necessarily tackled, because let’s face it: for the average Joe, thread programming can be difficult to get right.

Accordingly, I suspect that the “hundreds of thousands of lines written in C, C++, C#, Java and who knows what other non functional programming languages [that] are running concurrently” today were written that way on purpose. These programs were written with threads by smart people. Yet, I’m willing to bet that even those programs have subtle bugs that might not have shown up yet.

Applications that were not written with concurrency in mind (and their authors) aren’t going to see things scale up the way they used to; that is, throwing faster chips or more memory at them (like we could do in the past) won’t yield a performance increase. These applications will instead need to start running on multi-core chips, where they can begin to take advantage of parallelism (if written correctly to use it!).

Second, threaded programs aren’t terribly difficult to write, and no one disputes that; what’s difficult is getting them written correctly. Let’s face it, most testing strategies today rely on deterministic behavior: “given foo, then bar should be 23”. But threads, and those nefarious bugs that creep in when shared state and mutability butt heads, add inconsistency to this mindset. All of a sudden, the phrase “given foo, then bar should be 23” doesn’t always hold true. Sometimes bar is 89 and other times things blow up or, worse, lock up and bar is doomed.
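A small illustration of that inconsistency, as a sketch rather than a benchmark: two threads bump both a plain int and an AtomicInteger the same number of times. The class name CounterRace and the iteration count are arbitrary choices of mine. The atomic counter always lands on the expected total, while the plain one may come up short on some runs and not on others, which is exactly what makes such bugs hard to test for.

```java
import java.util.concurrent.atomic.AtomicInteger;

class CounterRace {

    static int unsafeCount = 0;                                 // shared, mutable, unguarded
    static final AtomicInteger safeCount = new AtomicInteger(); // guarded by atomic operations

    // Runs two threads that each increment both counters perThread times.
    static void runBoth(int perThread) {
        unsafeCount = 0;
        safeCount.set(0);
        Runnable work = () -> {
            for (int i = 0; i < perThread; i++) {
                unsafeCount++;               // read-modify-write: updates can be lost
                safeCount.incrementAndGet(); // atomic: no update is ever lost
            }
        };
        Thread a = new Thread(work);
        Thread b = new Thread(work);
        a.start();
        b.start();
        try {
            a.join();
            b.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        runBoth(100_000);
        System.out.println("expected: 200000");
        System.out.println("unsafe:   " + unsafeCount);     // varies from run to run
        System.out.println("atomic:   " + safeCount.get());
    }
}
```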

Thus, people have started evaluating alternate ways to leverage threads without actually using threads directly, because they can be hard to use correctly. If you haven’t read Edward A. Lee’s paper “The Problem with Threads” then go read it now. Mr. Lee does an excellent job of pointing out that our programming model based upon threads doesn’t “vaguely resemble the concurrency of the physical world.”

Yet, he makes a subtle statement regarding potential solutions that I’m sure Cédric would appreciate: “We should not replace established languages. We should instead build on them.”

For some developers the java.util.concurrent library will be good enough, but for others, the Actor Model will be a better fit: it essentially hides locks and synchronized blocks and, rather than sharing variables in memory, leverages mailboxes that effectively separate distinct processes from each other. And as it turns out, you can start using Actors in Java quite easily via a number of libraries.

Finally, I suspect that for some people, Clojure’s (and, by relation, functional programming’s) manifestation of concurrent programming is easier to grasp because it resembles the “concurrency of the physical world.” For me, Actors embody concurrency in a concrete manner: I can visualize a solution more easily.
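To give a feel for the mailbox idea, here is a hand-rolled sketch built only on java.util.concurrent. It is not any particular actor library’s API: the class CounterActor, the STOP sentinel, and the demo method are all invented for illustration. Senders never touch the actor’s state; they only drop messages into its mailbox, and a single thread owned by the actor processes them one at a time.

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

class CounterActor {

    private static final int STOP = Integer.MIN_VALUE; // hypothetical shutdown message

    private final BlockingQueue<Integer> mailbox = new LinkedBlockingQueue<>();
    private int total = 0; // modified only by the actor's own thread
    private final Thread worker = new Thread(this::processMailbox);

    void start() {
        worker.start();
    }

    // Any thread may send; nothing is shared except the mailbox.
    void send(int amount) {
        mailbox.add(amount);
    }

    // Asks the actor to finish, waits for it, then reads its final state.
    int stopAndGetTotal() {
        mailbox.add(STOP);
        join(worker);
        return total; // join() guarantees the worker's writes are visible here
    }

    private void processMailbox() {
        try {
            while (true) {
                int message = mailbox.take(); // blocks until a message arrives
                if (message == STOP) {
                    return;
                }
                total += message;
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }

    private static void join(Thread t) {
        try {
            t.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
    }

    // Two senders post 1000 messages each; the actor adds them up.
    static int demo() {
        CounterActor actor = new CounterActor();
        actor.start();
        Runnable sender = () -> {
            for (int i = 0; i < 1000; i++) {
                actor.send(1);
            }
        };
        Thread a = new Thread(sender);
        Thread b = new Thread(sender);
        a.start();
        b.start();
        join(a);
        join(b);
        return actor.stopAndGetTotal();
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```

Compared with the shared-counter approach, there is no lock or atomic in sight in the code that changes state; the queue does the coordinating.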

There are no silver bullets in software development. Thus, the concurrency werewolf won’t be slain easily; however, the options available to subdue the beast are manifold and those options that provide a concrete model of parallel programming, in my opinion, will be more successful than those that don’t.

 
[Fri, 20 Aug 2010 14:26:50 +0000]

I recently had the opportunity to chat with Stu Halloway (the author of “Programming Clojure” and the CTO and co-founder of Relevance) about, as you can probably guess, Clojure.

Briefly, Clojure is a “dialect of Lisp” and “predominantly a functional programming language” and thus, has a lot of smart people excited. As Stu himself states in the podcast, Clojure “unleashes the power of the JVM” and (in my interpretation of his words) allows a singular focus on solving a problem. That is, Clojure facilitates expressing the essence of a solution with elegant and maintainable code.

I must admit, I’ve been a bit of a skeptic of Lispy languages. I guess the fact that I had to learn and program some Lisp for a CS course in college has left a veritable scar on my conscience. You see, back then, C++ and this up and coming slow language for the web, dubbed Java, were “hot” and Lisp wasn’t even on the map of “cool” (at least for the people and companies I was hanging out with). Stu and the surrounding community’s excitement and passion for Clojure, however, have me re-engaging Lisp. I’ve even been reading Stu’s book!

If you’re curious about Clojure, I highly recommend listening to Stu. He’s a super interesting person, and his opinions on Object-Oriented programming, Patterns, and languages in general are well worth hearing.

 
[Mon, 16 Aug 2010 21:03:15 +0000]

Sid Anand, who writes the Practical Cloud Computing blog, has a series of posts entitled “SimpleDB Essentials for High Performance Users” in which he outlines a set of best practices and conventions for effectively leveraging SimpleDB. If you are using SimpleDB or are planning to, I highly recommend reading his points as they are super hip.

In particular, he advocates a form of sharding. That is, rather than putting all data into one SimpleDB domain, he recommends splitting data across several smaller domains so as to increase throughput. This makes a lot of sense; what’s more, sharding in this case isn’t terribly dangerous as SimpleDB doesn’t support cross-domain queries to begin with and id management is up to an application anyway. Lastly, there are limits to the amount of data you can store in a domain; thus, sharding can facilitate growth nicely.
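One simple way an application might route items to domains is by hashing the item id, sketched below. This is my own illustration and not necessarily the scheme Sid describes; the class DomainSharder and the base_N naming convention are assumptions.

```java
class DomainSharder {

    private final String baseName;
    private final int shardCount;

    DomainSharder(String baseName, int shardCount) {
        if (shardCount < 1) {
            throw new IllegalArgumentException("shardCount must be at least 1");
        }
        this.baseName = baseName;
        this.shardCount = shardCount;
    }

    // The same id always maps to the same domain, so reads know where to look.
    String domainFor(String itemId) {
        // floorMod keeps the bucket non-negative even for negative hash codes
        int bucket = Math.floorMod(itemId.hashCode(), shardCount);
        return baseName + "_" + bucket;
    }

    public static void main(String[] args) {
        DomainSharder sharder = new DomainSharder("parking_tickets", 4);
        System.out.println(sharder.domainFor("1234334325"));
    }
}
```

Note that changing the shard count later remaps most ids, so a scheme like this works best when the number of domains is chosen up front.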

While not an entry in the aforementioned series, his article entitled “SimpleDB Performance: 5 Steps to Achieving High Write Throughput” is excellent too. Don’t forget to check out my two articles on SimpleDB.

Finally, I highly recommend reading Werner Vogels’ (the CTO of Amazon) “Eventually Consistent – Revisited” as it provides a base of knowledge for what’s behind SimpleDB.

 
[Fri, 13 Aug 2010 13:02:27 +0000]

I’m excited to announce that IBM developerWorks has launched a new series of podcasts hosted by yours truly. These podcasts feature technical discussions with various (opinionated) luminaries on a diverse set of subjects ranging from Git to Clojure to Griffon and even .NET (just to name a few!).

The first podcast in the series is a discussion about Git with my friend, Matthew McCullough. I had the pleasure of attending a presentation Matthew gave on Git at an NFJS conference; I was thoroughly impressed with his passion and depth of knowledge regarding how to get started and use Git effectively.

I think you’ll find, as I did, his excitement regarding Git is infectious — if you don’t want to start using Git after listening to Matthew, then you probably never will! So what are you waiting for? Have a listen!

 
[Wed, 11 Aug 2010 20:18:25 +0000]

There’s an interesting thread of comments related to a blog post by Stephen Colebourne, who is giving a talk at this year’s JavaOne entitled “Next Big JVM language.” In particular, he and others note that the Fantom language could be the answer (I find this interesting as Fantom really wasn’t even on my radar until now). Moreover, many of the comments claim Scala to be the next big language. It seems people still prefer static typing over dynamic-ness. Either way, I got the distinct impression, based upon those individuals that left comments (which by no means reflects the community at large), that Groovy isn’t it.

Principally, the arguments against Groovy can be summarized as its lack of performance (compared to Scala, for instance). Not to be outdone, a few folks brought up Groovy++ (which attempts to add a bit of static-ness to Groovy, ostensibly to increase performance). Nevertheless, the comments are quite interesting to read, if for nothing else than to see that Fantom is gaining mind share, perhaps at the cost of other more mainstream alternatives like Groovy.

 
[Mon, 09 Aug 2010 14:56:13 +0000]

Modeling domain objects for almost any type of application is a breeze using a relational framework like Grails, but what about SimpleDB? This article published by IBM developerWorks entitled “Cloud storage with Amazon’s SimpleDB, Part 2” shows you how to use SimpleJPA, rather than the Amazon SDK, to persist objects in SimpleDB’s cloud storage.

SimpleJPA automatically converts primitive types to the string objects that SimpleDB recognizes; what’s more, SimpleJPA also handles SimpleDB’s no-join rules for you automatically, making it easier to model relationships. The bottom line: SimpleJPA can help you access significant, inexpensive scalability quickly and easily.
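Why does the conversion to strings matter? SimpleDB stores and compares everything as text, so a number has to be encoded in a way that keeps its ordering. The sketch below shows the general zero-padding idea for non-negative integers; it is not SimpleJPA’s actual code, and the class name, method names, and fixed width of 10 digits are my own assumptions.

```java
import java.util.Locale;

class LexicographicNumbers {

    // Pads a non-negative int to 10 digits so string order matches numeric order.
    static String encode(int value) {
        if (value < 0) {
            throw new IllegalArgumentException("negative values need an offset scheme");
        }
        return String.format(Locale.ROOT, "%010d", value);
    }

    // Leading zeros are simply ignored when parsing back.
    static int decode(String stored) {
        return Integer.parseInt(stored);
    }

    public static void main(String[] args) {
        // As plain strings, "9" sorts after "10"; padded, the order is numeric.
        System.out.println("9".compareTo("10") > 0);
        System.out.println(encode(9).compareTo(encode(10)) < 0);
    }
}
```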

Don’t forget to check out the previous article in this short series on SimpleDB, entitled “Cloud storage with Amazon’s SimpleDB, Part 1”, and while you are at it, check out my other articles related to NoSQL and the like!

 
