So, last time I talked about Supertags, but now we’re actually going to see an implementation. This is clearly very rough around the edges, but here we go.
class SupertagParser
  def self.parse(s, weight = 1)
    r = []
    t = []
    a = s.split(Supertag::Separator)
    a.each do |i|
      unless i =~ /[!=~]/
        t << i
        next
      end
      Supertag::Types.each do |k, v|
        if k == :conceptual or k == :oppositional
          if i.index(v.last) and i.index(v.last) > 0
            tags = i.split(v.last).uniq
            tags.perm(2) do |x|
              r << [x.sort.first, x.sort.last, v.first, weight]
            end
          end
        end
      end
    end
    return r.uniq, t
  end
end

class Supertag
  Types = {
    :conceptual => [1, "="],
    :oppositional => [2, "!"],
    :chronological => [3, "~"]
  }
  Separator = " "
end

# Array#perm taken from http://blade.nagaokaut.ac.jp/~sinara/ruby/math/combinatorics/array-perm.rb
# Author: Shin-ichiro Hara
class Array
  def perm(n = size)
    if size < n or n < 0
    elsif n == 0
      yield([])
    else
      self[1..-1].perm(n - 1) do |x|
        (0...n).each do |i|
          yield(x[0...i] + [first] + x[i..-1])
        end
      end
      self[1..-1].perm(n) do |x|
        yield(x)
      end
    end
  end
end
Okay, so say we have a hypothetical news item:
Ruby on Rails versus ASP.NET
Someone reads it, and tags it: rubyonrails=asp.net.
>> SupertagParser::parse("rubyonrails=asp.net")
=> [[["asp.net", "rubyonrails", 1, 1]], []]
What we get back is an array of arrays and an array. Each inner array is one relationship: the two tags (alphabetized), the relationship type (1 for conceptual), and the weight. That solo array holds any other tags that didn’t receive any kind of relationship.
>> SupertagParser::parse("rubyonrails=asp.net othertag")
=> [[["asp.net", "rubyonrails", 1, 1]], ["othertag"]]
Say you were creating a website in Ruby on Rails that uses supertags. The relationship arrays can just get passed to a Relationship model and saved. If a relationship already exists (notice that the parser alphabetizes each pair to reduce the chance of redundancy), it can just increase the weight of that relationship in the database.
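To make the "increase the weight" idea concrete, here is a rough sketch. RelationshipStore is a hypothetical in-memory stand-in I made up for illustration, not a real Rails model; an actual Relationship model would do the same find-or-increment against the database.

```ruby
# Hypothetical in-memory stand-in for a Relationship model (an assumption,
# not part of the parser). Keys are [tag_a, tag_b, type_id].
class RelationshipStore
  def initialize
    @weights = Hash.new(0)
  end

  # rows: [[tag_a, tag_b, type_id, weight], ...] in the shape the parser returns
  def record(rows)
    rows.each do |tag_a, tag_b, type_id, weight|
      # A new relationship starts at its weight; an existing one is incremented.
      @weights[[tag_a, tag_b, type_id]] += weight
    end
    self
  end

  def weight(tag_a, tag_b, type_id)
    @weights[[tag_a, tag_b, type_id]]
  end
end

store = RelationshipStore.new
# Two visitors tag the same article the same way.
store.record([["asp.net", "rubyonrails", 1, 1]])
store.record([["asp.net", "rubyonrails", 1, 1]])
puts store.weight("asp.net", "rubyonrails", 1)  # prints 2
```

Because the pairs arrive alphabetized, rubyonrails=asp.net and asp.net=rubyonrails land on the same key.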
We can handle permutations. Relationship tags don’t have to be limited to two tags. “XFS versus EXT3 versus ZFS” could be tagged xfs=ext3=zfs:
>> SupertagParser::parse("xfs=ext3=zfs")
=> [[["ext3", "xfs", 1, 1], ["xfs", "zfs", 1, 1], ["ext3", "zfs", 1, 1]], []]
We get all the right relationships. This is done by an extension to the Array class I found, Array#perm (credited in the code above).
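As an aside, Ruby 1.8.7 and later ship Array#combination, which produces the unordered pairs directly, so the sort-then-uniq step isn't needed. A small sketch of that alternative; unordered_pairs is a hypothetical helper name, not something the parser above defines:

```ruby
# Hypothetical helper: every unordered pair of tags, each pair alphabetized.
# Relies on the built-in Array#combination instead of Array#perm.
def unordered_pairs(tags)
  tags.uniq.combination(2).map { |pair| pair.sort }
end

p unordered_pairs("xfs=ext3=zfs".split("="))
# prints [["ext3", "xfs"], ["xfs", "zfs"], ["ext3", "zfs"]]
```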
So far I’ve taken a look at implementing conceptual (=), oppositional (!), and chronological (~) relationships. The parser above only emits the first two. Chronological has to be handled differently because we can’t alphabetize the tags: xp~vista needs to be kept in the correct order.
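One way the chronological case might look, as a sketch only. chronological_pairs is a hypothetical helper, and I'm assuming that a~b~c should relate every earlier tag to every later one; Array#combination keeps elements in their original order, so nothing gets alphabetized.

```ruby
# Mirrors Supertag::Types[:chronological]; repeated here so the sketch
# stands on its own.
CHRONOLOGICAL = [3, "~"]

# Hypothetical helper: returns [earlier, later, type_id, weight] rows,
# preserving the order in which the tags were written.
def chronological_pairs(tag, weight = 1)
  type_id, operator = CHRONOLOGICAL
  # Same guard as the parser: the operator must appear, and not at position 0.
  return [] unless tag.index(operator).to_i > 0
  parts = tag.split(operator).uniq
  parts.combination(2).map { |earlier, later| [earlier, later, type_id, weight] }
end

p chronological_pairs("xp~vista")
# prints [["xp", "vista", 3, 1]]
```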
Additionally, I don’t prevent mixing relationship types in a single relationship tag, but the output is meaningless. asp.net!rubyonrails=othertag shouldn’t be allowed until we have some kind of order of precedence, and that’s probably too complicated for the average web visitor anyway.
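Until then, a mixed tag could simply be rejected up front. A minimal sketch, assuming the same three operator characters; mixed_operators? is a hypothetical helper, not part of the parser above:

```ruby
# The three relationship operators the parser recognizes.
OPERATORS = /[!=~]/

# Hypothetical check: true when a tag uses more than one kind of operator.
def mixed_operators?(tag)
  tag.scan(OPERATORS).uniq.size > 1
end

puts mixed_operators?("asp.net!rubyonrails=othertag")  # prints true
puts mixed_operators?("xfs=ext3=zfs")                  # prints false
```

A parser could skip such tags, or report them back to the visitor, before building any relationships.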
So that’s the basic idea.