slovenia’s national budget and open data

January 21st, 2013

first, a disclaimer. in light of recent political events and unrest in slovenia, i’d like to stress that this post is not meant to take any sides. i’ll merely try to point out a project that might otherwise go unnoticed.

English: Detail from Government. Mural by Elihu Vedder. Lobby to Main Reading Room, Library of Congress Thomas Jefferson Building, Washington, D.C. (Photo credit: Wikipedia)

last year, i spent a couple of days reading our national budget. the purpose of the exercise was to find ways to create something not unlike the famous ‘death and taxes’ infographic. i was pleasantly surprised that our budget is actually very well designed, with a fascinating inherent structure of programs and spenders, but unpleasantly unsurprised that it was published as a PDF.

to create an infographic from such complex data, one that should be rebuilt every year, one needs programmatic ways to process it. so i ended up parsing the pdf, with many silly problems along the way. but it worked, and i’ve published the broken-down version for the years 2010-2012.

that was in spring, and ever since i’ve been waiting for the new government to finally publish the budget that was supposed to govern us this year, so i could compare it with the old ones. i really resent the fact that the budget was kept unpublished all throughout the legislative process. i really feel it’s an insult to the citizens.

but, they finally published it last week, and to my great surprise, they’ve really made an effort – they published detailed explanations of each section, and, ta-da-da-da, we have machine-parsable CSV files as well!

i realize it’s not perfect, but it’s light years ahead of what we used to have to deal with. so, who’s up for some info-charting now? 😉

Howto add Hacker News share button on

August 29th, 2012


English: Spanish metal button circa 1650-1675, 12mm diameter. (Photo credit: Wikipedia)


Every blogger writing about technology and/or startups knows how important Hacker News is for promoting good, quality articles. The effect of being published there is comparable to more general sharing sites like Reddit and Digg and StumbleUpon.


Unfortunately, unlike its bigger cousins, this service is not offered as an option for the sharing buttons shown after each post. Luckily there is an option to add a custom sharing service, which makes it really easy to create a button yourself. Here’s how.


  1. In your dashboard, go to Settings -> Sharing
  2. Click ‘Add a new service’ under ‘Available services’; a popup will show up
  3. Put ‘Hacker News’ under service name,
  4. Put “” under Sharing URL,
  5. Put “” under Icon URL, and hit ‘create share button’
  6. Drag the newly created ‘available service’ button to enabled services


and you’re done. now you have a shiny new Hacker News sharing button under every blog post. relax.




How big is the Web? [data]

August 15th, 2012

Tim Berners-Lee: The World Wide Web

I was curious about the total pageviews of the web. It turns out they are not really tracked anywhere, but they are easy to estimate, so I did a quick analysis.

First I found two sources for ‘global total pageviews’:

  • Akamai Net Usage Index – an amazing real-time dashboard of part of this data. They say that every minute 3 million pageviews are spent on news sites, and 10 million on social sites. That’s friggin’ a lot of pageviews! But I wanted to know the grand total, and hopefully get some sense of where the blogs are in the picture.
  • A blog post about interpolating this data from Alexa. Nice approach, but the data is a few years old, so I decided to repeat the process.

Alexa publishes pageviews for every site for free as a % of global pageviews. First thing to do was estimate the grand total, as described in that blog post, by looking at the published data from Wikipedia.

11,600,000,000 / 0.5% = 2,320,000,000,000 monthly total pageviews on the Web
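
A minimal sketch of that arithmetic in Python, for anyone who wants to repeat it with fresher numbers. The function name is made up for illustration. The two inputs are the figures quoted above: Wikipedia's monthly pageviews and its share of global pageviews as reported by Alexa.

```python
def estimate_total_pageviews(site_pageviews: float, share_percent: float) -> float:
    """Scale one site's known pageviews up to a global total.

    share_percent is the site's share of global pageviews, in percent.
    """
    if share_percent <= 0:
        raise ValueError("share_percent must be positive")
    return site_pageviews / (share_percent / 100.0)


# Figures quoted in the post (Wikipedia, monthly).
wikipedia_monthly_pageviews = 11_600_000_000
wikipedia_share_percent = 0.5

total = estimate_total_pageviews(wikipedia_monthly_pageviews, wikipedia_share_percent)
print(f"{total:,.0f} estimated monthly pageviews on the Web")
```

The estimate is only as good as the single reference site: any error in the published share is multiplied by the same factor as the pageviews.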

… told you it was easy 🙂 but that just means we can dig deeper. Alexa publishes the list of top million sites in a downloadable text file, so I wrote a script to go through it, scrape Alexa pages for the top 10,000 sites and store their individual traffic shares.
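
The script itself is not shown here, so the following is only a rough sketch of the two non-scraping steps, under stated assumptions: that the downloadable list is plain text with one `rank,domain` pair per line and no header row, and that the traffic shares have already been collected into a dictionary of percentages. The function names, domains and share values below are placeholders, not real data.

```python
import csv
import io
from typing import Dict, List, Tuple


def top_sites(ranking_csv: str, limit: int) -> List[Tuple[int, str]]:
    """Parse 'rank,domain' lines (assumed layout) and return the best-ranked sites."""
    sites = []
    for row in csv.reader(io.StringIO(ranking_csv)):
        if len(row) < 2:
            continue  # skip blank or malformed lines
        sites.append((int(row[0]), row[1].strip()))
    sites.sort()  # lowest rank number first
    return sites[:limit]


def pageviews_by_site(shares_percent: Dict[str, float], total: float) -> Dict[str, float]:
    """Turn each site's share of global pageviews (in percent) into absolute pageviews."""
    return {domain: total * share / 100.0 for domain, share in shares_percent.items()}


# Placeholder input, not real ranking data.
sample_ranking = "2,example.org\n1,example.com\n3,example.net\n"
print(top_sites(sample_ranking, 2))

# Placeholder shares, combined with a made-up total of 1000 pageviews.
print(pageviews_by_site({"example.com": 10.0, "example.org": 2.5}, 1000))
```

Keeping the parsing and the share-to-pageviews conversion separate from the scraping makes it possible to rerun the analysis on stored shares without hitting any site again.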

TypePad Blogs Get More Relevant With Zemanta Recommendations

May 16th, 2012


Groucho Marx & anonymous blogging (Photo credit: Wikipedia)

I’m late to the game, probably everyone knows already, but for the record:

TypePad Blogs Get More Relevant With Zemanta Recommendations

Life just got a little easier for bloggers who use TypePad. The hosted blogging platform announced that it is integrating Zemanta’s content recommendation tools into its service, which suggests links to related stories from across the Web. Zemanta also generates in-text links to related information.


… when we started 5 years ago, we had a list of the most relevant blogging platforms of all time. now all of them are our partners 🙂 it feels empowering and inspiring to make dreams happen, but you have to remind yourself of that achievement, because by the time you reach one dream, you already have another.

Linked Data – the best of two worlds

April 10th, 2012

noSQL is dual to SQL - Exploring NoSQL - YOW 2010 Melbourne (Photo credit: avlxyz)

warning: ignorant CEO rant

when oh when will the geeks realize that it’s not about the formats, but about products and customers? the tech decisions are made simpler if you have a real problem to solve. and tech standards are 90% of the time emergent from hundreds of best practices.

Linked Data – the best of two worlds

On the one hand you have structured data sources such as relational DB, NoSQL datastores or OODBs and the like that allow you to query and manipulate data in a structured way. This typically involves schemata (either upfront with RDB or sort of dynamically with NoSQL that defines the data layout and the types of the fields),…


time for “Push PR”

April 7th, 2012

An Empire of Silly Statistics…A Fake War for Public Relations (Photo credit: Marquette University)

Ernest is entirely right – PR companies just don’t get the fact that we don’t care what they think should interest us. That’s what being ‘independent’ means:

How PR fails at Blogger Outreach, March 29, 2012

At some point in the last year or so, someone pegged me as an influential blogger… and then it started. A constant and never-ceasing stream of daily e-mails from various PR companies mindlessly clogging up my inbox.

It does not, however, mean that PR is dead – there clearly is a need for ‘public relations’. The need is actually much larger than it ever was, on both sides: in corporate communications as much as on the receiving end, where bloggers have to be current and informed, just like journalists had to be.

I believe the solution is in making PR more pushy.

As a writer, I expect the right content to come to me; I don’t want to seek it out. In that sense, I expect it to be pushy, but also highly targeted and personalized. Just like it used to be, back in the days when there were roughly as many PR professionals as there were journalists, and the two crowds managed each other well.

As the new media grew, keeping up with targeting became impossible, and now PR companies rely on ‘curated’ lists of thousands of bloggers that they never really looked at. I believe that’s where we at Zemanta make a huge difference – I often link to PR messages from my posts, because they are recommended to me exactly when I’m writing about the topics they address, so they actually provide value to me. I would never go looking for them otherwise.

Pushy is not spammy, if done right. But there is no way you can do PR right without help from algorithms these days.

SeedTable – finally a decent overview of startups

April 1st, 2012

I love projects that make large datasets usable. This one took way too long to be done, but finally – now we can stop wasting clicks and get an executive summary of our city’s startups. 🙂

Also, I will take this opportunity to invite you all for a sneak peek at East Start Map – please let me know what we’re missing.

SeedTable Is A Stunning New Way To Interrogate CrunchBase – And Find Investors

I have a love/hate relationship with CrunchBase. On the one hand it has great information about startup tech companies. On the other hand, it relies on a wiki-like structure which means it is sometime not updated as frequently or as accurately as old-style databases which used to employ people go over the data regularly.


30K WordPress Blogs Infected With the Latest Malware Scam

March 8th, 2012

And all my blogs were amongst them:

30K WordPress Blogs Infected With the Latest Malware Scam

alphadogg writes with an excerpt from an article over at Network World: “Almost 30,000 WordPress blogs have been infected in a new wave of attacks orchestrated by a cybercriminal gang whose primary goal is to distribute rogue antivirus software, researchers from security firm Websense say.”

It caused me quite a headache yesterday, as I was trying to rescue content, move it to more reliable hosting, and reconfigure all the analytics and links. here’s what i ended up doing:

  1. delete all infected files from dreamhost immediately. this rendered all my content inaccessible to anyone, including myself, but at least it prevented the virus from spreading to my readers.
  2. download sql databases
  3. register for a paid account and be pleasantly surprised by the free domain.
  4. sed all datadumps, so that the links will still work on the new domain
  5. as I realized that hosted wordpress won’t allow me to use google analytics, i signed up for cloudflare – i love it, and highly recommend it
  6. install local wordpress on my machine, import datadumps, export as wordpress file
  7. import into new
  8. and while I was at it, i recategorized everything, into the 5 top level areas exposed on the left now.
hope it brings a better experience to everyone reading this…
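
For the link-rewriting step (step 4), here is a minimal Python sketch of the same idea as a sed substitution: replace the old domain with the new one throughout a dump. The function names and the example domains are hypothetical, and the sketch assumes the dump is a UTF-8 text file. One caveat worth knowing: WordPress stores some options as PHP-serialized strings with length prefixes, so a plain text replacement can corrupt those values when the two domains differ in length.

```python
from pathlib import Path
from typing import Tuple


def rewrite_links(text: str, old_domain: str, new_domain: str) -> Tuple[str, int]:
    """Replace every occurrence of old_domain and report how many were changed."""
    if not old_domain:
        raise ValueError("old_domain must not be empty")
    return text.replace(old_domain, new_domain), text.count(old_domain)


def rewrite_dump(src: Path, dst: Path, old_domain: str, new_domain: str) -> int:
    """Stream a dump file line by line, writing the rewritten copy to dst."""
    total = 0
    with src.open(encoding="utf-8") as fin, dst.open("w", encoding="utf-8") as fout:
        for line in fin:
            new_line, changed = rewrite_links(line, old_domain, new_domain)
            fout.write(new_line)
            total += changed
    return total


# Hypothetical domains, for illustration only.
sample = "INSERT INTO wp_posts VALUES ('http://old.example.com/post-1');"
rewritten, changed = rewrite_links(sample, "old.example.com", "new.example.com")
print(changed, rewritten)
```

Writing to a separate output file keeps the original dump untouched, which is useful if the replacement turns out to be too greedy.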

Hello world!

March 5th, 2012

Screenshots comparing the world maps in Pokémo...

Image via Wikipedia

thanks for stopping by. unfortunately my blog was hacked yesterday and i’m working on restoring it… please check back later.

Federated, Zemanta Launch Program to Connect Bloggers with Brands

February 13th, 2012

I rarely write about my company here, but i’m exceptionally proud of this one:

Federated, Zemanta Launch Program to Connect Bloggers with Brands

Federated Media Publishing and Zemanta have announced a strategic partnership that will use technology to make it easier for bloggers and companies to connect, increasing opportunities to create targeted content marketing campaigns.


Federated Media and John Battelle have been role models for all of us for years, and it’s a privilege to work with them.

Where Am I?

You are currently browsing the My Projects category at Rational Idealist.