<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>Flying memes &#187; Semantic Relations</title>
	<atom:link href="http://sandropaganotti.com/tag/semantic-relations/feed/" rel="self" type="application/rss+xml" />
	<link>http://sandropaganotti.com</link>
	<description>Just another WordPress weblog</description>
	<lastBuildDate>Fri, 23 Mar 2012 19:07:23 +0000</lastBuildDate>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.org/?v=3.3.1</generator>
		<item>
		<title>A semantic experiment to separate good from bad.</title>
		<link>http://sandropaganotti.com/2010/05/04/a-semantic-experiment-for-separate-good-from-bad/</link>
		<comments>http://sandropaganotti.com/2010/05/04/a-semantic-experiment-for-separate-good-from-bad/#comments</comments>
		<pubDate>Tue, 04 May 2010 11:37:35 +0000</pubDate>
		<dc:creator>admin</dc:creator>
				<category><![CDATA[Algoritmi]]></category>
		<category><![CDATA[Semantic Relations]]></category>
		<category><![CDATA[wordnet]]></category>

		<guid isPermaLink="false">http://sandropaganotti.com/?p=379</guid>
		<description><![CDATA[Yesterday was Sunday and I came up with a fascinating idea: what happens if I use wordnet to measure the distance between two words? By assigning a weight to each relation type and navigating the resulting relation graph, I thought I would be able to measure the distance between a word and the others in [...]]]></description>
			<content:encoded><![CDATA[<p>Yesterday was Sunday and I came up with a fascinating idea: what happens if I use <a href="http://github.com/roja/words">wordnet</a> to measure the distance between two words? By assigning a weight to each relation type and navigating the resulting relation graph, I thought I would be able to measure the distance between a word and every other word as the minimum sum of the edge weights along a path connecting the two.</p>
<p><span id="more-379"></span><br />
So I tried to assign weights using the relation type as the discriminator. As an example, take the word &#8216;sword&#8217; and its relations:</p>
<pre><code>
related-term        relation type           assigned weight
weapon:             hypernym                5
backsword:          hyponym                 2
blade:              part_meronym            3
broadsword:         hyponym                 2
cavalry sword:      hyponym                 2
cutlas:             hyponym                 2
Excalibur:          instance_hyponym        2
falchion:           hyponym                 2
fencing sword:      hyponym                 2
foible:             part_meronym            3
forte:              part_meronym            3
haft:               part_meronym            3
hilt:               part_meronym            3
rapier:             hyponym                 2
point:              part_meronym            3
</code></pre>
<p>The weight I chose for each relation type follows the rule<br />
&#8216;the more closely two words are related, the smaller the number&#8217;. So weapon is less<br />
related to sword than broadsword is, because the former expresses a broader concept than sword<br />
(a nuclear bomb is also a weapon), while the latter narrows the word &#8216;sword&#8217; and<br />
makes the statement &#8216;A broadsword is always a sword&#8217; true, so it is more closely related to the<br />
chosen word.</p>
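<p>As a minimal sketch of this rule, the weights from the table above can be kept in a hash keyed by relation type, and the distance along a single path is then the sum of the weights of the relations it crosses. The names <code>RELATION_WEIGHTS</code> and <code>path_distance</code> are made up for this illustration and are not part of the words gem:</p>
<pre>
<code class="ruby">
# Weights copied from the 'sword' table above; the constant name is hypothetical.
RELATION_WEIGHTS = {
  'hypernym'         => 5,
  'hyponym'          => 2,
  'instance_hyponym' => 2,
  'part_meronym'     => 3
}.freeze

# Distance along one path = sum of the weights of the relation types crossed.
# fetch raises a KeyError for a relation type that has no weight assigned.
def path_distance(relation_types, weights = RELATION_WEIGHTS)
  relation_types.sum { |type| weights.fetch(type) }
end

# sword -> weapon (hypernym), then weapon -> another kind of weapon (hyponym)
path_distance(%w[hypernym hyponym])  # => 7
</code>
</pre>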
<p>Following this general rule, I associated a weight with each of the most common relation types and wrote a few lines of code to compute the distances by navigating the relation graph:</p>
<pre>
<code class="ruby">
def compute_distances(weights = :default, max_depth_allowed = 6)

  # retrieve the weights associated with each relation type
  # (it's just a hash: {relation_type => weight})
  weights = CONFIG_FILE['distance']["#{weights}"]
  data = Words::Wordnet.new

  # get a list of synsets as a starting point (e.g. red, crimson)
  synsets_to_analyze = self.synsets.map{|s| [s.synset_id,0,0]}
  synsets_to_store   = []

  # process the first element of the list
  # until the synsets_to_analyze stack is empty
  while(sys = synsets_to_analyze.shift) do
    sys_id,dis,dep = *sys; next if dep >= max_depth_allowed
    sys = Words::Synset.new(sys_id,data.wordnet_connection,nil) rescue next;

    # save the words of the current synset into the output array
    sys.words.each {|w|  synsets_to_store.unshift([sys_id,w,dis])}

    # push each synset related to this one onto the stack unless it
    # has already been stored
    sys.relations.each do |r|
      synsets_to_analyze.unshift(
        [r.destination.synset_id, dis + weights["#{r.relation_type}"],dep + 1]
      ) if r.is_semantic? and
           !synsets_to_store.find{|s| s.first == r.destination.synset_id}
    end
  end

  # now synsets_to_store holds an array of words, each with
  # the weight that separates it from the starting synsets.
  # (here I store them in a db, but only because the context is the same as the Abacus gem)
  synsets_to_store.each do |s|
    a_id = ArticleKey.find_by_the_key(s[1]).id rescue next
    self.distances.find_or_create_by_article_key_id( a_id, :distance => s[2])
  end

end
</code>
</pre>
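<p>The walk above takes synsets from a stack and skips those already stored, so the value kept for a word seems to depend on the visiting order rather than being guaranteed minimal. A minimal sketch of one way to get the true minimum sum of weights is Dijkstra's algorithm, shown here on a toy graph. The graph, its weights and the name <code>min_distances</code> are invented for the illustration and are not read from wordnet:</p>
<pre>
<code class="ruby">
# Toy graph: node => { neighbour => edge weight }. Hypothetical data that
# only mirrors the shape of the 'sword' example.
GRAPH = {
  'sword'  => { 'weapon' => 5, 'rapier' => 2, 'blade' => 3 },
  'weapon' => { 'knife' => 2, 'gun' => 2 },
  'blade'  => { 'knife' => 3 },
  'knife'  => { 'dagger' => 2 }
}.freeze

# Dijkstra's algorithm with a plain hash as the frontier
# (a linear scan for the minimum is fine for small graphs).
def min_distances(graph, source)
  settled  = {}
  frontier = { source => 0 }
  until frontier.empty?
    # settle the closest node that is still pending
    node, dist = frontier.min_by { |_, d| d }
    frontier.delete(node)
    settled[node] = dist
    graph.fetch(node, {}).each do |neighbour, weight|
      next if settled.key?(neighbour)
      candidate = dist + weight
      known = frontier[neighbour]
      # keep the candidate only if it improves on the best path known so far
      frontier[neighbour] = candidate if known.nil? || known > candidate
    end
  end
  settled
end

distances = min_distances(GRAPH, 'sword')
distances['knife']   # => 6 (via blade, not 7 via weapon)
distances['dagger']  # => 8
</code>
</pre>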
<p>Here are some of the results for &#8216;sword&#8217; with depth = 3:</p>
<pre><code>
sword:                           0
brand:                           0
steel:                           0
broadsword:                      2
rapier:                          2
tuck:                            2
backsword:                       2
fencing sword:                   2
falchion:                        2
Excalibur:                       2
cutlas:                          2
sabre:                           2
cavalry sword:                   2
saber:                           2
cutlass:                         2
foible:                          3
blade:                           3
hilt:                            3
forte:                           3
tip:                             3
peak:                            3
point:                           3
helve:                           3
haft:                            3
claymore:                        4
scimitar:                        4
saber:                           4
sabre:                           4
foil:                            4
epee:                            4
arm:                             5
basket hilt:                     5
head:                            5
weapon system:                   5
knife blade:                     5
weapon:                          5
widow's peak:                    5
cusp:                            5
razorblade:                      5
cutting edge:                    6
pommel:                          6
knob:                            6
knife edge:                      6
fire ship:                       7
shaft:                           7
slasher:                         7
missile:                         7
Greek fire:                      7
missile:                         7
weapon of mass destruction:      7
light arm:                       7
WMD:                             7
gun:                             7
flamethrower:                    7
pike:                            7
brass knucks:                    7
knucks:                          7
brass knuckles:                  7
knuckles:                        7
W:                               7
tomahawk:                        7
hatchet:                         7
lance:                           7
knuckle duster:                  7
bow and arrow:                   7
projectile:                      7
sling:                           7
bow:                             7
stun baton:                      7
spear:                           7
stun gun:                        7
convexity:                       8
cutting implement:               8
convex shape:                    8
portion:                         8
part:                            8
handle:                          8
grip:                            8
hold:                            8
handgrip:                        8
reap hook:                       9
knife:                           9
sticker:                         9
dagger:                          9
axe:                             9
file:                            9
awl:                             9
lawn mower:                      9
mower:                           9
scissors:                        9
ax:                              9
sickle:                          9
cone shape:                      9
conoid:                          9
cone:                            9
reaping hook:                    9
pencil:                          9
arrowhead:                       9
knife:                           9
pair of scissors:                9
spatula:                         9
spatula:                         9
alpenstock:                      9
instrument:                      10
weapons system:                  11
implements of war:               11
arms:                            11
munition:                        11
weaponry:                        11
</code></pre>
<p>Now, as you may notice, there is still a lot of tuning to do; for example, it is pretty strange that &#8216;weapon of mass destruction&#8217; is more semantically related to &#8216;sword&#8217; than &#8216;dagger&#8217; is <img src='http://sandropaganotti.com/wp-includes/images/smilies/icon_smile.gif' alt=':-)' class='wp-smiley' />.</p>
<p>Anyway, I&#8217;m pretty pleased with the results of this small experiment, although I&#8217;m still far from my initial idea: calculating the weight of each word in the dictionary in relation to &#8216;good&#8217; and &#8216;bad&#8217;, and using these weights to estimate the &#8216;mood&#8217; of some common trends on Twitter.</p>
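<p>A rough sketch of how that last step might look, assuming two distance tables have already been computed, one starting from 'good' and one from 'bad'. The tables below are hypothetical placeholder values, not real results, and <code>mood_score</code> is a made-up name:</p>
<pre>
<code class="ruby">
# Hypothetical distance tables; in the real experiment they would come from
# running the distance computation from 'good' and from 'bad'.
GOOD_DISTANCES = { 'great' => 2, 'nice' => 3, 'awful' => 9 }.freeze
BAD_DISTANCES  = { 'great' => 9, 'nice' => 8, 'awful' => 2 }.freeze

# Positive score: the words sit closer to 'good'; negative: closer to 'bad'.
# Words missing from either table are ignored.
def mood_score(text, good = GOOD_DISTANCES, bad = BAD_DISTANCES)
  words = text.downcase.scan(/[a-z']+/)
  known = words.select { |w| good.key?(w) and bad.key?(w) }
  known.sum { |w| bad[w] - good[w] }
end

mood_score('What a great, nice day')  # => 12
mood_score('Awful traffic')           # => -7
</code>
</pre>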
]]></content:encoded>
			<wfw:commentRss>http://sandropaganotti.com/2010/05/04/a-semantic-experiment-for-separate-good-from-bad/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
	</channel>
</rss>

