<?xml version="1.0" encoding="UTF-8"?><rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
		>
<channel>
	<title>Comments on: A quick foray into linear algebra and Python: tf-idf</title>
	<atom:link href="http://timtrueman.com/2008/02/10/a-quick-foray-into-linear-algebra-and-python-tf-idf/feed/" rel="self" type="application/rss+xml" />
	<link>http://timtrueman.com/2008/02/10/a-quick-foray-into-linear-algebra-and-python-tf-idf/</link>
	<description>A tagline? What am I&#8230;a super hero?</description>
	<lastBuildDate>Fri, 15 May 2009 23:33:19 -0700</lastBuildDate>
	<generator>http://wordpress.org/?v=2.9.2</generator>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
		<item>
		<title>By: Tim Trueman</title>
		<link>http://timtrueman.com/2008/02/10/a-quick-foray-into-linear-algebra-and-python-tf-idf/comment-page-1/#comment-384</link>
		<dc:creator>Tim Trueman</dc:creator>
		<pubDate>Wed, 20 Aug 2008 01:21:07 +0000</pubDate>
		<guid isPermaLink="false">http://timtrueman.com/2008/02/10/a-quick-foray-into-linear-algebra-and-python-tf-idf/#comment-384</guid>
		<description>@Thomas, glad you found it useful!</description>
		<content:encoded><![CDATA[<p>@Thomas, glad you found it useful!</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Thomas</title>
		<link>http://timtrueman.com/2008/02/10/a-quick-foray-into-linear-algebra-and-python-tf-idf/comment-page-1/#comment-379</link>
		<dc:creator>Thomas</dc:creator>
		<pubDate>Tue, 19 Aug 2008 09:58:47 +0000</pubDate>
		<guid isPermaLink="false">http://timtrueman.com/2008/02/10/a-quick-foray-into-linear-algebra-and-python-tf-idf/#comment-379</guid>
		<description>Wonderful post, it makes a technical concept from Natural Language Processing easily understandable. Hurrah for pseudocode, here&#039;s to Python.</description>
		<content:encoded><![CDATA[<p>Wonderful post, it makes a technical concept from Natural Language Processing easily understandable. Hurrah for pseudocode, here&#8217;s to Python.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Fitzgerald Steele</title>
		<link>http://timtrueman.com/2008/02/10/a-quick-foray-into-linear-algebra-and-python-tf-idf/comment-page-1/#comment-338</link>
		<dc:creator>Fitzgerald Steele</dc:creator>
		<pubDate>Thu, 14 Aug 2008 18:45:22 +0000</pubDate>
		<guid isPermaLink="false">http://timtrueman.com/2008/02/10/a-quick-foray-into-linear-algebra-and-python-tf-idf/#comment-338</guid>
		<description>Thanks for the simple explanation of TF-IDF.

Joséphine, and others...reading from a file in Python is pretty straightforward.  Let&#039;s assume we&#039;re using Python 2.5+

&lt;code&gt;
f = open(&#039;file1.txt&#039;)
txt = f.read()
f.close()
documentList.append(txt)
&lt;/code&gt;

Of course, we should put some error handling around it using a try/except, or the with statement if you&#039;re using Python 2.5 or better.</description>
		<content:encoded><![CDATA[<p>Thanks for the simple explanation of TF-IDF.</p>
<p>Joséphine, and others&#8230;reading from a file in Python is pretty straightforward.  Let&#8217;s assume we&#8217;re using Python 2.5+</p>
<p><code><br />
f = open('file1.txt')<br />
txt = f.read()<br />
f.close()<br />
documentList.append(txt)</code></p>
<p>Of course, we should put some error handling around it using a try/except, or the with statement if you're using Python 2.5 or better.</p>
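<p>As a minimal sketch of the <code>with</code> plus try/except variant mentioned above: the helper name <code>read_documents</code> and the file paths are hypothetical (not from the original post), and <code>documentList</code> is assumed to be a plain list of strings.</p>

```python
import os
import tempfile


def read_documents(paths):
    """Return the text of each readable file in `paths` as a list of strings."""
    documents = []
    for path in paths:
        try:
            # `with` closes the file even if read() raises.
            with open(path) as f:
                documents.append(f.read())
        except IOError as err:
            # Skip unreadable files rather than aborting the whole run.
            print("could not read %s: %s" % (path, err))
    return documents


if __name__ == "__main__":
    # Demo with a throwaway file; replace these paths with your own.
    tmp_dir = tempfile.mkdtemp()
    sample = os.path.join(tmp_dir, "file1.txt")
    with open(sample, "w") as f:
        f.write("The quick brown fox")
    missing = os.path.join(tmp_dir, "missing.txt")
    documentList = read_documents([sample, missing])
    print(documentList)
```

<p>This assumes Python 2.6 or later (it also runs on Python 3). On 2.5 itself, <code>with</code> needs <code>from __future__ import with_statement</code> and the <code>except</code> clause would be spelled <code>except IOError, err</code>.</p>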
]]></content:encoded>
	</item>
	<item>
		<title>By: Tim Trueman</title>
		<link>http://timtrueman.com/2008/02/10/a-quick-foray-into-linear-algebra-and-python-tf-idf/comment-page-1/#comment-336</link>
		<dc:creator>Tim Trueman</dc:creator>
		<pubDate>Wed, 13 Aug 2008 16:52:47 +0000</pubDate>
		<guid isPermaLink="false">http://timtrueman.com/2008/02/10/a-quick-foray-into-linear-algebra-and-python-tf-idf/#comment-336</guid>
		<description>Joséphine,

What you are trying to do is beyond the scope of this tutorial. However, you can find out how to read from a file with a simple search query.</description>
		<content:encoded><![CDATA[<p>Joséphine,</p>
<p>What you are trying to do is beyond the scope of this tutorial. However, you can find out how to read from a file with a simple search query.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Joséphine</title>
		<link>http://timtrueman.com/2008/02/10/a-quick-foray-into-linear-algebra-and-python-tf-idf/comment-page-1/#comment-335</link>
		<dc:creator>Joséphine</dc:creator>
		<pubDate>Wed, 13 Aug 2008 12:11:00 +0000</pubDate>
		<guid isPermaLink="false">http://timtrueman.com/2008/02/10/a-quick-foray-into-linear-algebra-and-python-tf-idf/#comment-335</guid>
		<description>Hi Tim,
my purpose is to compare Wikipedia articles with WordNet words in order to find which Wikipedia article is the right one for the WordNet word.
Thanks in advance for your answer.</description>
		<content:encoded><![CDATA[<p>Hi Tim,<br />
my purpose is to compare Wikipedia articles with WordNet words in order to find which Wikipedia article is the right one for the WordNet word.<br />
Thanks in advance for your answer.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Joséphine</title>
		<link>http://timtrueman.com/2008/02/10/a-quick-foray-into-linear-algebra-and-python-tf-idf/comment-page-1/#comment-334</link>
		<dc:creator>Joséphine</dc:creator>
		<pubDate>Wed, 13 Aug 2008 12:08:23 +0000</pubDate>
		<guid isPermaLink="false">http://timtrueman.com/2008/02/10/a-quick-foray-into-linear-algebra-and-python-tf-idf/#comment-334</guid>
		<description>Hi Tim,

Your code is really great!
But I have a question, how can I use it by reading from file?
thanks</description>
		<content:encoded><![CDATA[<p>Hi Tim,</p>
<p>Your code is really great!<br />
But I have a question, how can I use it by reading from file?<br />
thanks</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Tim Trueman</title>
		<link>http://timtrueman.com/2008/02/10/a-quick-foray-into-linear-algebra-and-python-tf-idf/comment-page-1/#comment-110</link>
		<dc:creator>Tim Trueman</dc:creator>
		<pubDate>Mon, 30 Jun 2008 20:19:00 +0000</pubDate>
		<guid isPermaLink="false">http://timtrueman.com/2008/02/10/a-quick-foray-into-linear-algebra-and-python-tf-idf/#comment-110</guid>
		<description>KARL,

This code doesn&#039;t read from a file at all. To keep it simple and avoid unnecessary code, I just made it use strings as if they were separate documents.

The way to use this code is to replace &lt;code&gt;&quot;&quot;&quot;DOCUMENT #1 TEXT&quot;&quot;&quot;&lt;/code&gt; with your space-delimited document inside the three quotes (e.g. &lt;code&gt;&quot;&quot;&quot;The quick brown fox jumped over the lazy dog&quot;&quot;&quot;&lt;/code&gt;). Do this for each document you are using.</description>
		<content:encoded><![CDATA[<p>KARL,</p>
<p>This code doesn&#8217;t read from a file at all. To keep it simple and avoid unnecessary code, I just made it use strings as if they were separate documents.</p>
<p>The way to use this code is to replace <code>"""DOCUMENT #1 TEXT"""</code> with your space-delimited document inside the three quotes (e.g. <code>"""The quick brown fox jumped over the lazy dog"""</code>). Do this for each document you are using.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: KARL</title>
		<link>http://timtrueman.com/2008/02/10/a-quick-foray-into-linear-algebra-and-python-tf-idf/comment-page-1/#comment-109</link>
		<dc:creator>KARL</dc:creator>
		<pubDate>Sun, 29 Jun 2008 10:31:32 +0000</pubDate>
		<guid isPermaLink="false">http://timtrueman.com/2008/02/10/a-quick-foray-into-linear-algebra-and-python-tf-idf/#comment-109</guid>
		<description>Hi Tim

Thanks for this code - I need to do a term frequency calculation for entries in a database.
I&#039;m not really that familiar with Python. If I have a space-separated file of words, where in the code above do I put the file name?
Thanks</description>
		<content:encoded><![CDATA[<p>Hi Tim</p>
<p>Thanks for this code &#8211; I need to do a term frequency calculation for entries in a database.<br />
I&#8217;m not really that familiar with Python. If I have a space-separated file of words, where in the code above do I put the file name?<br />
Thanks</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: m13</title>
		<link>http://timtrueman.com/2008/02/10/a-quick-foray-into-linear-algebra-and-python-tf-idf/comment-page-1/#comment-106</link>
		<dc:creator>m13</dc:creator>
		<pubDate>Fri, 20 Jun 2008 10:53:26 +0000</pubDate>
		<guid isPermaLink="false">http://timtrueman.com/2008/02/10/a-quick-foray-into-linear-algebra-and-python-tf-idf/#comment-106</guid>
		<description>Yes, currently I am writing my own tokenizer that will be applied before feature selection/weighting. That&#039;s why I needed to understand whether this code includes (simple) tokenization or not (as I already mentioned, I don&#039;t know Python that well) ;)

Thank you so much for replying in such a short time!</description>
		<content:encoded><![CDATA[<p>Yes, currently I am writing my own tokenizer that will be applied before feature selection/weighting. That&#8217;s why I needed to understand whether this code includes (simple) tokenization or not (as I already mentioned, I don&#8217;t know Python that well) <img src='http://timtrueman.com/wp-includes/images/smilies/icon_wink.gif' alt=';)' class='wp-smiley' /> </p>
<p>Thank you so much for replying in such a short time!</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Tim Trueman</title>
		<link>http://timtrueman.com/2008/02/10/a-quick-foray-into-linear-algebra-and-python-tf-idf/comment-page-1/#comment-105</link>
		<dc:creator>Tim Trueman</dc:creator>
		<pubDate>Thu, 19 Jun 2008 21:43:56 +0000</pubDate>
		<guid isPermaLink="false">http://timtrueman.com/2008/02/10/a-quick-foray-into-linear-algebra-and-python-tf-idf/#comment-105</guid>
		<description>Glad you&#039;ve found it somewhat useful! &lt;code&gt;document.split(None)&lt;/code&gt; takes the document and splits it into a list containing each word in the document (None means split on any run of whitespace: spaces, tabs, or newlines). This is effectively a simple tokenizer. A more sophisticated tokenizer would perhaps provide better results, but for the example I decided it was good enough.

Does that answer your question(s)?</description>
		<content:encoded><![CDATA[<p>Glad you&#8217;ve found it somewhat useful! <code>document.split(None)</code> takes the document and splits it into a list containing each word in the document (None means split on any run of whitespace: spaces, tabs, or newlines). This is effectively a simple tokenizer. A more sophisticated tokenizer would perhaps provide better results, but for the example I decided it was good enough.</p>
<p>Does that answer your question(s)?</p>
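<p>A quick sketch of what that split does, and where it falls short as a tokenizer. The <code>tokenize</code> wrapper and the sample strings are illustrative only, not taken from the original post.</p>

```python
def tokenize(document):
    """Split on runs of whitespace, exactly as document.split(None) does."""
    return document.split(None)


text = "The quick  brown\tfox\njumped"

# Spaces, tabs and newlines all act as separators; repeated
# separators do not produce empty tokens.
print(tokenize(text))

# Splitting on a literal space behaves differently: the double space
# yields an empty string, and the tab and newline stay inside a token.
print(text.split(" "))

# Punctuation and case are left untouched, which is the main thing a
# more sophisticated tokenizer would handle.
print(tokenize("The lazy dog. the LAZY dog!"))
```

<p>With this approach "dog." and "dog!" count as different terms, as do "The" and "the", so lowercasing and stripping punctuation before splitting is a common next step.</p>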
]]></content:encoded>
	</item>
</channel>
</rss>

