Posterous + Latest Updates

Welcome to "All Things Tagxedo", the official Tagxedo blog, now hosted by Posterous!  Now you get all the goodies that come with a "real" blogging platform -- search, comments, tags, RSS, and more.

I have written about 10 posts in the temporary "blog", which have now all been reposted.

The following is a list of recent updates since I last posted:

(1) Make a Tagxedo, Quick! -- If you haven't been to the homepage lately, I encourage you to do so.  Now you can quickly create a Tagxedo in mere seconds!  Enter a URL, Twitter ID, delicious ID, news keywords, search terms, or an RSS-ready URL (e.g. a typical blog), hit enter, and voila!

More specifics about what data is used by Tagxedo to create the cloud:

URL -- the webpage
Twitter ID -- the latest 100 posts
Delicious ID -- the top 100 Delicious tags (including frequencies)
News keywords -- the Google News RSS feed
Search keywords -- the Bing search RSS feed
RSS -- the corresponding RSS feed of the URL

(2) [Experimental] HTML cloud -- while Silverlight is required to build a Tagxedo, you can now create a clickable tag cloud that is based solely on HTML and JavaScript.  Go to "Save", figure out the desired dimensions of the tag cloud, and save both an image (must be named "tagxedo.jpg") and the corresponding HTML ("tagxedo.html").  Now just open the HTML and you have a fully working clickable Tagxedo that everyone can see!

I am not sure what to do with this yet, hence this is just an experiment.  Though this sounds nice in theory (no plugin required), the image is at least 2X larger in size than the corresponding Silverlight player, and the words don't animate.  Is this acceptable?  Useful?

There are more ways to create HTML clouds.  For example, (a) use only plain text and CSS magic (only works with standard web fonts and "horizontal" orientation), and (b) use HTML5 Canvas + SVG fonts.  The killer problem (even for (a)) is that font rendering is inconsistent across browsers -- and HTML5 and better web standards won't solve this problem.  When the words are off, they may create either holes or overlaps in the word cloud, potentially turning a visually stunning cloud into an eyesore.

I don't know, I'm still looking into this.  I'd like to get some feedback on this.

(3) Custom Shape -- you can continue to edit a shape after you have "accepted" it and used it.  Yeah, sometimes the shape looks good until you put it in action.

(4) Hard Boundary -- now you can specify whether the boundary of the shape is soft (the default) or hard.  Hard boundaries cannot be violated, which at times helps make the shape stand out more.  The disadvantage is that only words that fit exactly will be accepted, potentially leaving some regions completely unfilled.

That's it for now.

[Repost 4/30/2010] - Better Input, Better Shapes, and More Polishing

I am releasing yet another round of improvements to Tagxedo. There are two major areas of improvement that I'm sure will make some of you happy -- text input and shapes.

(1) Text Input

Phrases - you can now specify a phrase by using tilde (~) as a connector. For example, San~Francisco. Tilde is supported if you use the "enter text" or the "load file" methods, but not the "load webpage" method.

Punctuations and Numbers - I added this opt-in feature and briefly mentioned it in the previous blog post. You have pretty fine control over which punctuation marks to skip or keep (e.g. skip all punctuation except "./:@" to preserve URLs and email addresses).
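As a rough illustration of the "skip all punctuation except ./:@" idea, here is a minimal Python sketch. It is not Tagxedo's actual implementation: the function name `split_words`, the `keep` parameter, and the choice to replace skipped punctuation with a space are all assumptions made for the example.

```python
import string


def split_words(text, keep="./:@"):
    """Split text into words, skipping ASCII punctuation not listed in `keep`.

    Assumption: a skipped punctuation mark acts as a word separator,
    so it is replaced by a space before splitting on whitespace.
    """
    skipped = {ord(ch): " " for ch in string.punctuation if ch not in keep}
    return text.translate(skipped).split()


print(split_words("Visit http://tagxedo.com, or mail me@example.com!"))
```

Because ".", "/", ":" and "@" are kept, the URL and the email address each survive as a single word, while the comma and the exclamation mark are dropped.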

Remove common words - also known as "stop words", enabled by default (option to turn it off).

Combine related words - also known as stemming. Stemming is turned on by default, but users can turn it off.

Combine identical words - by default, no word appears more than once; instead, its size reflects how often it occurs. However, if the "combine identical words" feature is turned off, every occurrence of a word shows up separately in the cloud. This doesn't sound very exciting until you combine it with the next feature...

User-specified frequencies - you can now specify the frequency of individual words using the "frequency modifier" feature. Write the word in this format: Word:Frequency. For example: "New~York:8363710 Los~Angeles:3833995" (population data is from Wikipedia). The frequency must be an integer between 1 and 999,999,999. If you are concerned about conflicts with existing punctuation, you can change the frequency modifier to something else (for example, "::" or ":=").
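To make the Word:Frequency format concrete, here is a small Python sketch of how such input could be parsed. It is only an illustration, not Tagxedo's code: the names `parse_token` and `word_frequencies` are made up, and treating a token with a missing or out-of-range frequency as a plain word with frequency 1 is an assumption.

```python
from collections import Counter

MAX_FREQUENCY = 999_999_999  # upper limit stated in the post


def parse_token(token, modifier=":"):
    """Return (word, frequency) for one whitespace-separated token."""
    head, sep, tail = token.rpartition(modifier)
    if sep and head and tail.isascii() and tail.isdigit():
        freq = int(tail)
        if 1 <= freq <= MAX_FREQUENCY:
            # A tilde joins the parts of a phrase, e.g. New~York.
            return head.replace("~", " "), freq
    # Assumption: anything else is an ordinary word that counts once.
    return token.replace("~", " "), 1


def word_frequencies(text, modifier=":", combine_identical=True):
    """Return a list of (word, frequency) pairs in order of first appearance."""
    pairs = [parse_token(tok, modifier) for tok in text.split()]
    if not combine_identical:
        return pairs  # every occurrence stays a separate entry
    totals = Counter()
    for word, freq in pairs:
        totals[word] += freq
    return list(totals.items())


print(word_frequencies("New~York:8363710 Los~Angeles:3833995"))
print(word_frequencies("Love Love Love", combine_identical=False))
```

Splitting on the last occurrence of the modifier means a token such as "http://example.com" is left alone, since what follows the final ":" is not a number.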

With all this flexibility, what you can do is up to your imagination. For example, try "Happy~Birthday!:5 Love Love Love ... (150 times)" with the "combine identical words" option turned off, the "punctuations" option turned on...

(2) Shape

By far the most powerful feature of Tagxedo is its shape handling. Tagxedo can fill words nicely into shapes, from simple geometries to complex shapes (e.g. the soccer ball and the UCLA logo shown in the gallery). As I played with Tagxedo more and more, I felt the urge to try it on even fancier pictures -- portraits of people. One of the Tagxedoes I created -- President Abraham Lincoln's famous portrait -- was so vivid and realistic that I literally froze in admiration when I first made it. I kept on trying different portraits (yes, I am addicted to my Tagxedo as well). Originally, the process was far from trivial. I had to load the image into an image editor, tweak it, threshold it into black-and-white, save it, load it into Tagxedo... Even then, not all images were satisfactory. And without a high-quality shape input, Tagxedo cannot do much.

So I invested a great deal of effort to make the shape-making process as easy as possible. As I mentioned in the last blog post, I created a "Preview" window that lets users adjust the threshold and blurriness to improve the robustness of the shapes. However, it still did not always produce good results: there is a single threshold and a single blurriness for the whole image, but the best result can sometimes be obtained only by using different threshold (and blurriness) values in different places. Of course, one can always resort to Photoshop to sanitize the image before passing it to Tagxedo, but I was not willing to go that route. I want something that my mom can use!

Hence the latest round of improvements, which I call Commit Brush. You adjust the threshold and blurriness, and then you *paint* a part of the image that you feel comfortable with, using either the black brush, the white brush, or the "capture" brush (which captures what is currently drawn). Adjust some more, and paint some more. Adjust some more, and paint some more. The area that is painted is called "committed", and Tagxedo simply records the pixel values at the moment of first painting. But if commitment sounds scary to you, there is undo and redo to help you out if you make a mistake.
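The idea of committing pixels can be sketched in a few lines of Python. This is a simplified illustration under several assumptions, not how Tagxedo is implemented: the image is a plain grid of grayscale values (0-255), "black" means a value below the threshold, blurriness and undo/redo are left out, and the class and method names are invented for the example.

```python
def threshold(gray, level):
    """Turn a grayscale grid into black (True) / white (False) pixels."""
    return [[value < level for value in row] for row in gray]


class CommitCanvas:
    """Remembers pixels that were painted, whatever the threshold is later."""

    def __init__(self, gray):
        self.gray = gray
        self.committed = {}  # (row, col) -> True (black) or False (white)

    def paint(self, pixels, level, brush="capture"):
        current = threshold(self.gray, level)
        for row, col in pixels:
            if (row, col) in self.committed:
                continue  # assumption: the first painting of a pixel wins
            if brush == "capture":
                self.committed[(row, col)] = current[row][col]
            else:
                self.committed[(row, col)] = (brush == "black")

    def render(self, level):
        """Threshold the image, then overlay the committed pixels."""
        result = threshold(self.gray, level)
        for (row, col), value in self.committed.items():
            result[row][col] = value
        return result


gray = [[10, 200],
        [120, 130]]
canvas = CommitCanvas(gray)
canvas.paint([(1, 0)], level=128)  # capture: 120 is black at threshold 128
print(canvas.render(level=100))    # the committed pixel stays black
```

At threshold 100 the pixel with value 120 would normally turn white, but because it was captured at threshold 128 it stays black. That is what lets different parts of one image effectively use different thresholds.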

I know what I described may sound strange or unusual, but it is really easy to do. Just try it. Perhaps I'll make a video demo out of it later. The end result is that I am now able to easily turn tricky images into usable shapes, making Tagxedo so much more fun!

That's it for now.