Tess4J Maven Demo [Easy to Understand]

全栈程序员-站长 • June 14, 2022, 10:46 AM • Uncategorized • 32 views



A text-recognition (OCR) demo built with Tess4J. The implementation source code follows; it is only a demo. Demo download: tess4jDemo

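// Imports assume JUnit 4, SLF4J, and the Tess4J 3.x-or-later package layout.
import static org.junit.Assert.assertTrue;

import java.awt.Rectangle;
import java.awt.image.BufferedImage;
import java.io.File;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

import javax.imageio.ImageIO;

import org.junit.After;
import org.junit.AfterClass;
import org.junit.Before;
import org.junit.BeforeClass;
import org.junit.Test;
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;

import com.recognition.software.jdeskew.ImageDeskew;

import net.sourceforge.tess4j.ITessAPI.TessPageIteratorLevel;
import net.sourceforge.tess4j.ITesseract;
import net.sourceforge.tess4j.ITesseract.RenderedFormat;
import net.sourceforge.tess4j.Tesseract;
import net.sourceforge.tess4j.Word;
import net.sourceforge.tess4j.util.ImageHelper;
import net.sourceforge.tess4j.util.LoggHelper;
import net.sourceforge.tess4j.util.Utils;

public class Tess4JTest {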

    private static final Logger logger = LoggerFactory.getLogger(new LoggHelper().toString());
    static final double MINIMUM_DESKEW_THRESHOLD = 0.05d;
    ITesseract instance;

    private final String datapath = "src/test/resources";
    private final String testResourcesDataPath = "src/test/resources/test-data";
    private final String testResourcesLanguagePath = "src/test/resources/tessdata";

    @BeforeClass
    public static void setUpClass() throws Exception {
    }

    @AfterClass
    public static void tearDownClass() throws Exception {
    }

    @Before
    public void setUp() {
        instance = new Tesseract();
        instance.setDatapath(new File(datapath).getPath());
    }

    @After
    public void tearDown() {
    }

    /**
     * Test of doOCR method, of class Tesseract.
     * Recognizes text from an image file.
     * @throws Exception while processing image.
     */
    @Test
    public void testDoOCR_File() throws Exception {
        logger.info("doOCR on a PNG image");
        File imageFile = new File(this.testResourcesDataPath, "0099.png");
        //set language
        instance.setDatapath(testResourcesLanguagePath);
        instance.setLanguage("chi_sim");
        String result = instance.doOCR(imageFile);
        logger.info(result);
    }

    /**
     * Test of doOCR method, of class Tesseract.
     * Recognizes text from an in-memory BufferedImage.
     * @throws Exception while processing image.
     */
    @Test
    public void testDoOCR_BufferedImage() throws Exception {
        logger.info("doOCR on a buffered image of a PNG");
        File imageFile = new File(this.testResourcesDataPath, "ocr.png");
        BufferedImage bi = ImageIO.read(imageFile);

        //set language
        instance.setDatapath(testResourcesLanguagePath);
        instance.setLanguage("chi_sim");

        String result = instance.doOCR(bi);
        logger.info(result);
    }

    /**
     * Test of getSegmentedRegions method, of class Tesseract.
     * Gets the coordinates of each segmented region.
     * @throws java.lang.Exception
     */
    @Test
    public void testGetSegmentedRegions() throws Exception {
        logger.info("getSegmentedRegions at given TessPageIteratorLevel");
        File imageFile = new File(testResourcesDataPath, "ocr.png");
        BufferedImage bi = ImageIO.read(imageFile);
        int level = TessPageIteratorLevel.RIL_SYMBOL;
        logger.info("PageIteratorLevel: " + Utils.getConstantName(level, TessPageIteratorLevel.class));
        List<Rectangle> result = instance.getSegmentedRegions(bi, level);
        for (int i = 0; i < result.size(); i++) {
            Rectangle rect = result.get(i);
            logger.info(String.format("Box[%d]: x=%d, y=%d, w=%d, h=%d", i, rect.x, rect.y, rect.width, rect.height));
        }

        assertTrue(result.size() > 0);
    }


    /**
     * Test of doOCR method, of class Tesseract.
     * Recognizes text within a given rectangular region.
     * @throws Exception while processing image.
     */
    @Test
    public void testDoOCR_File_Rectangle() throws Exception {
        logger.info("doOCR on a PNG image with bounding rectangle");
        File imageFile = new File(this.testResourcesDataPath, "ocr.png");
        // set the language data
        instance.setDatapath(testResourcesLanguagePath);
        instance.setLanguage("chi_sim");
        // define the region to recognize
        // x and y are measured from the top-left corner; width and height extend from (x, y)
        Rectangle rect = new Rectangle(84, 21, 15, 13);
        String result = instance.doOCR(imageFile, rect);
        logger.info(result);
    }

    /**
     * Test of createDocuments method, of class Tesseract.
     * Saves the recognition results to files.
     * @throws java.lang.Exception
     */
    @Test
    public void testCreateDocuments() throws Exception {
        logger.info("createDocuments for png");
        File imageFile = new File(this.testResourcesDataPath, "ocr.png");
        String outputbase = "target/test-classes/docrenderer-2";
        List<RenderedFormat> formats = new ArrayList<RenderedFormat>(Arrays.asList(RenderedFormat.HOCR, RenderedFormat.TEXT));

        // set the language data
        instance.setDatapath(testResourcesLanguagePath);
        instance.setLanguage("chi_sim");

        instance.createDocuments(new String[]{imageFile.getPath()}, new String[]{outputbase}, formats);
    }

    /**
     * Test of getWords method, of class Tesseract.
     * Extracts words from the image.
     * @throws java.lang.Exception
     */
    @Test
    public void testGetWords() throws Exception {
        logger.info("getWords");
        File imageFile = new File(this.testResourcesDataPath, "ocr.png");

        // set the language data
        instance.setDatapath(testResourcesLanguagePath);
        instance.setLanguage("chi_sim");

        // extract at the level of individual characters (symbols)
        int pageIteratorLevel = TessPageIteratorLevel.RIL_SYMBOL;
        logger.info("PageIteratorLevel: " + Utils.getConstantName(pageIteratorLevel, TessPageIteratorLevel.class));
        BufferedImage bi = ImageIO.read(imageFile);
        List<Word> result = instance.getWords(bi, pageIteratorLevel);

        //print the complete result
        for (Word word : result) {
            logger.info(word.toString());
        }
    }

    /**
     * Test of Invalid memory access.
     * Handles skewed images by deskewing before recognition.
     * @throws Exception while processing image.
     */
    @Test
    public void testDoOCR_SkewedImage() throws Exception {
        // set the language data
        instance.setDatapath(testResourcesLanguagePath);
        instance.setLanguage("chi_sim");

        logger.info("doOCR on a skewed JPG image");
        File imageFile = new File(this.testResourcesDataPath, "ocr_skewed.jpg");
        BufferedImage bi = ImageIO.read(imageFile);
        ImageDeskew id = new ImageDeskew(bi);
        double imageSkewAngle = id.getSkewAngle(); // determine skew angle
        if ((imageSkewAngle > MINIMUM_DESKEW_THRESHOLD || imageSkewAngle < -(MINIMUM_DESKEW_THRESHOLD))) {
            bi = ImageHelper.rotateImage(bi, -imageSkewAngle); // deskew image
        }

        String result = instance.doOCR(bi);
        logger.info(result);
    }

}
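
The skew check in testDoOCR_SkewedImage and the rectangle convention in testDoOCR_File_Rectangle can also be tried without Tesseract installed. The following is a minimal sketch using only the JDK; the class OcrPrep and its methods needsDeskew and clampToImage are hypothetical names introduced here for illustration and are not part of Tess4J or the original demo.

```java
import java.awt.Rectangle;

/** Hypothetical helper (not part of Tess4J): pure-JDK checks to run before calling doOCR. */
final class OcrPrep {

    private OcrPrep() {
        // static utility, no instances
    }

    /**
     * Returns true when the magnitude of the skew angle exceeds the threshold.
     * Equivalent to the test's condition: angle > threshold || angle < -threshold.
     */
    static boolean needsDeskew(double skewAngle, double threshold) {
        return Math.abs(skewAngle) > threshold;
    }

    /**
     * Clips a requested region to an image of the given size.
     * The origin is the top-left corner; width and height extend right and down from (x, y).
     * The result is empty (isEmpty() returns true) when the region lies outside the image.
     */
    static Rectangle clampToImage(Rectangle region, int imageWidth, int imageHeight) {
        return region.intersection(new Rectangle(0, 0, imageWidth, imageHeight));
    }
}
```

One possible use, assuming the image size is known: call OcrPrep.clampToImage(rect, bi.getWidth(), bi.getHeight()) before passing the rectangle to doOCR, and skip recognition when the result is empty.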


Published by: 全栈程序员-站长. Please credit the source when reposting: https://javaforall.net/130445.html. Original link: https://javaforall.net
