<rss xmlns:atom="http://www.w3.org/2005/Atom" version="2.0"><channel><title>Computer Vision - Tags - Vindrin</title><link>https://vindrin.top/zh-cn/tags/%E8%AE%A1%E7%AE%97%E6%9C%BA%E8%A7%86%E8%A7%89/</link><description>Computer Vision - Tags - Vindrin</description><generator>Hugo -- gohugo.io</generator><language>zh-CN</language><managingEditor>vindrin@outlook.com (Vindrin)</managingEditor><webMaster>vindrin@outlook.com (Vindrin)</webMaster><lastBuildDate>Wed, 18 Feb 2026 00:00:00 +0000</lastBuildDate><atom:link href="https://vindrin.top/zh-cn/tags/%E8%AE%A1%E7%AE%97%E6%9C%BA%E8%A7%86%E8%A7%89/" rel="self" type="application/rss+xml"/><item><title>Local AI Image Classifier</title><link>https://vindrin.top/zh-cn/project/ai-image-classifier/</link><pubDate>Wed, 18 Feb 2026 00:00:00 +0000</pubDate><author>vindrin@outlook.com (Vindrin)</author><guid>https://vindrin.top/zh-cn/project/ai-image-classifier/</guid><description><![CDATA[<h1 id="本地-ai-图像分类器">Local AI Image Classifier</h1>
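<p>Classify images using text descriptions. Everything runs locally, with no dependency on any cloud API.</p>
<h2 id="原理">How it works</h2>
<p>The classifier uses <strong>OpenAI CLIP</strong> to compute the similarity between an image and a set of text labels; the label with the highest score is the classification result.</p>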
<div class="code-block code-line-numbers open" style="counter-reset: code-block 0">
    <div class="code-header language-python">
        <span class="code-title"><i class="arrow fas fa-angle-right" aria-hidden="true"></i></span>
        <span class="ellipses"><i class="fas fa-ellipsis-h" aria-hidden="true"></i></span>
        <span class="copy" title="Copy to clipboard"><i class="far fa-copy" aria-hidden="true"></i></span>
    </div><div class="highlight"><pre tabindex="0" class="chroma"><code class="language-python" data-lang="python"><span class="line"><span class="cl"><span class="kn">from</span> <span class="nn">PIL</span> <span class="kn">import</span> <span class="n">Image</span>
</span></span><span class="line"><span class="cl"><span class="kn">import</span> <span class="nn">clip</span><span class="o">,</span> <span class="nn">torch</span>
</span></span><span class="line"><span class="cl">
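</span></span><span class="line"><span class="cl"><span class="n">model</span><span class="p">,</span> <span class="n">preprocess</span> <span class="o">=</span> <span class="n">clip</span><span class="o">.</span><span class="n">load</span><span class="p">(</span><span class="s2">&#34;ViT-B/32&#34;</span><span class="p">,</span> <span class="n">device</span><span class="o">=</span><span class="s2">&#34;cpu&#34;</span><span class="p">)</span>  <span class="c1"># pin to CPU so the model and the inputs share a device</span>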
</span></span><span class="line"><span class="cl"><span class="n">image</span> <span class="o">=</span> <span class="n">preprocess</span><span class="p">(</span><span class="n">Image</span><span class="o">.</span><span class="n">open</span><span class="p">(</span><span class="s2">&#34;photo.jpg&#34;</span><span class="p">))</span><span class="o">.</span><span class="n">unsqueeze</span><span class="p">(</span><span class="mi">0</span><span class="p">)</span>
</span></span><span class="line"><span class="cl"><span class="n">labels</span> <span class="o">=</span> <span class="p">[</span><span class="s2">&#34;a cat&#34;</span><span class="p">,</span> <span class="s2">&#34;a dog&#34;</span><span class="p">,</span> <span class="s2">&#34;a car&#34;</span><span class="p">,</span> <span class="s2">&#34;a tree&#34;</span><span class="p">]</span>
</span></span><span class="line"><span class="cl"><span class="n">text</span> <span class="o">=</span> <span class="n">clip</span><span class="o">.</span><span class="n">tokenize</span><span class="p">(</span><span class="n">labels</span><span class="p">)</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl"><span class="k">with</span> <span class="n">torch</span><span class="o">.</span><span class="n">no_grad</span><span class="p">():</span>
</span></span><span class="line"><span class="cl">    <span class="n">logits</span><span class="p">,</span> <span class="n">_</span> <span class="o">=</span> <span class="n">model</span><span class="p">(</span><span class="n">image</span><span class="p">,</span> <span class="n">text</span><span class="p">)</span>
</span></span><span class="line"><span class="cl">    <span class="n">probs</span> <span class="o">=</span> <span class="n">logits</span><span class="o">.</span><span class="n">softmax</span><span class="p">(</span><span class="n">dim</span><span class="o">=-</span><span class="mi">1</span><span class="p">)</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl"><span class="n">best</span> <span class="o">=</span> <span class="n">labels</span><span class="p">[</span><span class="n">probs</span><span class="o">.</span><span class="n">argmax</span><span class="p">()]</span>
</span></span><span class="line"><span class="cl"><span class="nb">print</span><span class="p">(</span><span class="sa">f</span><span class="s2">&#34;Prediction: </span><span class="si">{</span><span class="n">best</span><span class="si">}</span><span class="s2">&#34;</span><span class="p">)</span></span></span></code></pre></div></div>
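<p>To make the scoring step less of a black box, here is a minimal pure-Python sketch of what the similarity comparison roughly amounts to: cosine similarity between one image embedding and each label embedding, scaled and passed through a softmax. The 3-dimensional vectors, the helper names (<code>cosine</code>, <code>softmax</code>, <code>classify</code>), and the scale of 100 are assumptions for illustration; real CLIP embeddings come from the model and its temperature is a learned parameter.</p>

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def classify(image_vec, label_vecs, labels, scale=100.0):
    """Return (best_label, probabilities) for one image embedding.

    `scale` is an assumed temperature; it sharpens the softmax so the
    closest label dominates.
    """
    logits = [scale * cosine(image_vec, v) for v in label_vecs]
    probs = softmax(logits)
    best = max(range(len(labels)), key=probs.__getitem__)
    return labels[best], probs


# Toy embeddings, made up for illustration only.
labels = ["a cat", "a dog", "a car"]
label_vecs = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
image_vec = [0.9, 0.1, 0.0]

best, probs = classify(image_vec, label_vecs, labels)
print(f"Prediction: {best}")
```

<p>Because the softmax is taken only over the labels supplied, the probabilities always sum to 1 even when none of the labels fits the image, so a high score means "best of these labels" rather than "certainly correct".</p>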
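<h2 id="界面">Interface</h2>
<p>A Flask-based web UI: drag in an image, enter the labels, and get the classification result immediately.</p>
<h2 id="状态">Status</h2>
<p><code>v0.1.0</code>: proof of concept. It runs on CPU, but processing large batches of images is slow.</p>]]></description></item></channel></rss>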