百度360必应搜狗淘宝本站头条
当前位置:网站首页 > 技术流 > 正文

数据处理---Spring Batch之实践

citgpt 2024-06-27 19:59 8 浏览 0 评论

上面介绍了Spring Batch的基本概念和简单的demo项目,显然这些还是不够实际使用的。下面我们来更多的代码实践。


数据处理---Spring Batch之实践

在上面的基础项目上面,我们来更多的修改:


不用项目默认的hsql DB,用mysql,让ItemReader,ItemWriter 支持mysql;


支持处理结果自定义保存到数据库,我们用项目里面的JPA;


让Quartz来定时调用spring batch的Job


可以读取文件,而不只是数据库;


下面开始动手。


修改pom.xml,节选部分,实在太长了。


<spring.framework.version>3.2.0.RELEASE</spring.framework.version>

<spring.batch.version>2.1.7.RELEASE</spring.batch.version> <!-- 这个的版本和quartz的版本要注意,不然很容易在运行的时候出错 -->

<dependency>

<groupId>mysql</groupId>

<artifactId>mysql-connector-java</artifactId>

<version>5.1.23</version>

</dependency>

<dependency>

<groupId>org.springframework.data</groupId>

<artifactId>spring-data-jpa</artifactId>

<version>1.3.2.RELEASE.rebuild</version>

</dependency>

<dependency>

<groupId>org.hibernate</groupId>

<artifactId>hibernate-core</artifactId>

<version>3.6.10.Final</version>

</dependency>

<dependency>

<groupId>org.hibernate</groupId>

<artifactId>hibernate-entitymanager</artifactId>

<version>3.6.10.Final</version>

</dependency>

<dependency>

<groupId>org.quartz-scheduler</groupId>

<artifactId>quartz</artifactId>

<version>2.1.7</version>

</dependency>

<dependency>

<groupId>org.springframework</groupId>

<artifactId>spring-context-support</artifactId>

<version>3.2.0.RELEASE</version>

</dependency>

launch-context.xml定义需要用到的bean,

<bean id="entityManagerFactory"

class="org.springframework.orm.jpa.LocalContainerEntityManagerFactoryBean">

<property name="persistenceUnitName" value="spring" />

<property name="dataSource" ref="dataSource" />

<property name="jpaDialect">

<bean class="com.test.batch.CustomHibernateJpaDialect" />

</property>

<property name="loadTimeWeaver">

<bean

class="org.springframework.instrument.classloading.InstrumentationLoadTimeWeaver" />

</property>

</bean>

<bean id="dataSource"

class="org.springframework.jdbc.datasource.DriverManagerDataSource">

<property name="driverClassName" value="com.mysql.jdbc.Driver" />

<property name="url"

value="jdbc:mysql://localhost:3306/testbatch?useUnicode=true&characterEncoding=UTF-8&autoReconnect=true" />

<property name="username" value="root" />

<property name="password" value="root" />

</bean>

<bean id="transactionManager" class="org.springframework.orm.jpa.JpaTransactionManager">

<property name="entityManagerFactory" ref="entityManagerFactory" />

<property name="dataSource" ref="dataSource" />

</bean>

<bean id="customerMapper" class="com.test.batch.CustomerMapper"></bean>

<bean id="lineTokenizer" class="org.springframework.batch.item.file.transform.DelimitedLineTokenizer"></bean>

<!-- 这个可以让spring batch和hibernate共用

<bean id="itemReader"

class="org.springframework.batch.item.database.HibernateCursorItemReader">

<property name="sessionFactory" ref="sessionFactory" />

<property name="queryString" value="from CustomerCredit" />

</bean>

-->

<bean id="itemReaderFile" class="org.springframework.batch.item.file.FlatFileItemReader" scope="step">

<property name="lineMapper">

<bean class="org.springframework.batch.item.file.mapping.DefaultLineMapper">

<property name="lineTokenizer" ref="lineTokenizer"/>

<property name="fieldSetMapper" ref="customerMapper"/>

</bean>

</property>

<!-- <property name="resource" value="file:#{jobParameters['customFileAbPath']}"/> 配置读取文件的路径,我固定了 -->

<property name="resource" value="file:D:/temp/testdata.txt"/>

</bean>

<bean id="itemReader" class="org.springframework.batch.item.database.JdbcCursorItemReader">

<property name="dataSource" ref="dataSource"/>

<property name="sql" value="select ID, NAME, CREDIT from CUSTOMER"/>

<property name="rowMapper">

<bean class="com.test.batch.CustomerCreditRowMapper"/>

</property>

<property name="fetchSize" value="100"></property>

<property name="maxRows" value="10000"></property>

</bean>


<jdbc:initialize-database data-source="dataSource">

<!-- <jdbc:script location="${batch.schema.script}" /> -->

<jdbc:script location="org/springframework/batch/core/schema-drop-mysql.sql" />

<jdbc:script location="org/springframework/batch/core/schema-mysql.sql" />

</jdbc:initialize-database


<!-- batch:job-repository id="jobRepository" / -->

<batch:job-repository id="jobRepository"

data-source="dataSource" transaction-manager="transactionManager"

isolation-level-for-create="SERIALIZABLE" table-prefix="BATCH_" />

<!-- 下面是Quartz调用spring batch 的job要用到的 -->

<bean class="org.springframework.scheduling.quartz.SchedulerFactoryBean">

<property name="triggers">

<list><ref bean="scheduledTrigger"></ref></list>

</property>

</bean>


<bean id="scheduledTrigger" class="org.springframework.scheduling.quartz.CronTriggerFactoryBean">

<property name="jobDetail" ref="jobDetail"/>

<property name="cronExpression">

<value>*/10 * * * * ?</value>

</property>

</bean>


<bean id="jobDetail" class="org.springframework.scheduling.quartz.JobDetailFactoryBean">

<property name="jobClass" value="com.test.batch.JobLauncherDetails" />

<property name="jobDataAsMap">

<map>

<entry key="jobName" value="job1" /> <!--指定jobID -->

<entry key="jobLocator" value-ref="jobRegistry" />

<entry key="jobLauncher" value-ref="jobLauncher" />

<entry key="param1" value="p1" />

<entry key="param2" value="p2" />

</map>

</property>

<property name="durability" value="true" />

</bean>


<bean id="jobLauncher"

class="org.springframework.batch.core.launch.support.SimpleJobLauncher">

<property name="jobRepository" ref="jobRepository" />

</bean>


<bean

class="org.springframework.batch.core.configuration.support.JobRegistryBeanPostProcessor">

<property name="jobRegistry" ref="jobRegistry" />

</bean>


<bean id="jobRegistry" class="org.springframework.batch.core.configuration.support.MapJobRegistry" />


<import resource="classpath:/META-INF/spring/module-context.xml" />


有几个定制的东西,需要说下。


我们用到的domain对象是这样的,很多不必要的都省略了


@Entity

@Table(name = "customer")

public class CustomerCredit {

public static final String TABLE_NAME = "customer";


@Column(name = "name", nullable = true, length = 12)

private String name;


@Id

@GeneratedValue(strategy = GenerationType.IDENTITY)

@Column(name = "id", nullable = false)

private Integer id;


@Column(name = "credit", nullable = true, precision = 15, scale = 2)

private BigDecimal credit;

FlatFileItemReader 使用默认的DefaultLineMapper,里面指定了lineTokenizer和fieldSetMapper,就是说要怎么读一行行的数据,读了之后怎么和Bean的字段Mapper,这样后面的步骤好继续处理。

public class CustomerMapper implements FieldSetMapper<CustomerCredit> {

@Override

public CustomerCredit mapFieldSet(FieldSet fieldSet) throws BindException {

CustomerCredit lv = new CustomerCredit();

lv.setId(Integer.parseInt(fieldSet.readString(0)));

lv.setName(fieldSet.readString(1));

lv.setCredit(fieldSet.readBigDecimal(2));

return lv;

}

}

如果是读数据库呢?就是下面那个JdbcCursorItemReader,里面指定了dataSource,sql,rowMapper,这些都类似文件


public class CustomerCreditRowMapper implements RowMapper {

public static final String ID_COLUMN = "id";

public static final String NAME_COLUMN = "name";

public static final String CREDIT_COLUMN = "credit";


public Object mapRow(ResultSet rs, int rowNum) throws SQLException {

CustomerCredit customerCredit = new CustomerCredit();

customerCredit.setId(rs.getInt(ID_COLUMN));

customerCredit.setName(rs.getString(NAME_COLUMN));

customerCredit.setCredit(rs.getBigDecimal(CREDIT_COLUMN));

return customerCredit;

}

}

还可以和hibernate结合来读取数据,可以读取存储过程的数据,等等,具体可以参考spring batch文档


说好的自定义处理数据呢,比如我们把每个人的Credit加10.


@Component("customProcessor")

public class CustomProcessor implements

ItemProcessor<CustomerCredit, CustomerCredit> {

@PersistenceContext

private EntityManager em;

@Override

public CustomerCredit process(CustomerCredit item) throws Exception {

System.out.println(new Date().toString()+"start to process");

if (item == null) {

return null;

}

try {

item = em.find(CustomerCredit.class, item.getId());

item.setCredit(item.getCredit().add(new BigDecimal(10)));

//find by id才可以persist (否则出现detached entity passed to persist)

em.persist(item);

} catch (Exception e) {

e.printStackTrace();

}

return item;

}

}

每一步的处理,我们想看看结果,可以定制一个ItemReadListener


public class CustomStepListener implements ItemReadListener<CustomerCredit> 具体代码省略

修改module-context.xml


<batch:job id="job1">

<batch:step id="step1">

<batch:tasklet transaction-manager="transactionManager"

start-limit="100">

<batch:chunk reader="itemReade" writer="itemwriter" processor="customProcessor"

commit-interval="3" />

</batch:tasklet>

</batch:step>

</batch:job>

在META-INF下面加个persistence.xml


<?xml version="1.0" encoding="UTF-8"?>

<persistence version="1.0"

xmlns="http://java.sun.com/xml/ns/persistence" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"

xsi:schemaLocation="http://java.sun.com/xml/ns/persistence http://java.sun.com/xml/ns/persistence/persistence_1_0.xsd">

<persistence-unit name="spring" transaction-type="RESOURCE_LOCAL">

<!--

<class>com.test.jpatest.model.Customer</class>

<class>com.test.jpatest.model.Address</class>

-->

<provider>org.hibernate.ejb.HibernatePersistence</provider>

</persistence-unit>

</persistence>

Caused by: org.springframework.beans.factory.BeanCreationException: Error creating bean with name 'exampleConfiguration': Injection of autowired dependencies failed; nested exception is org.springframework.beans.factory.BeanCreationException: Could not autowire field: private org.springframework.batch.core.repository.JobRepository com.test.batch.ExampleConfiguration.jobRepository; nested exception is org.springframework.beans.factory.BeanCreationException: Error creating bean with name 'jobRepository': Cannot resolve reference to bean 'transactionManager' while setting bean property 'transactionManager'; nested exception is org.springframework.beans.factory.BeanCreationException: Error creating bean with name 'transactionManager' defined in class path resource [launch-context.xml]: Cannot resolve reference to bean 'entityManagerFactory' while setting bean property 'entityManagerFactory'; nested exception is org.springframework.beans.factory.BeanCreationException: Error creating bean with name 'entityManagerFactory' defined in class path resource [launch-context.xml]: Invocation of init method failed; nested exception is java.lang.IllegalArgumentException: No persistence unit with name 'spring' found

运行的时候遇到上面的错误,就是没有上面的配置的原因


org.springframework.transaction.InvalidIsolationLevelException: Standard JPA does not support custom isolation levels - use a special JpaDialect for your JPA implementation

at org.springframework.orm.jpa.DefaultJpaDialect.beginTransaction(DefaultJpaDialect.java:66)

at org.springframework.orm.jpa.vendor.HibernateJpaDialect.beginTransaction(HibernateJpaDialect.java:59)

遇到这样的错误,是因为默认的JPA不支持自定义的事物隔离级别。可以自定义一个CustomHibernateJpaDialect extends HibernateJpaDialect,具体代码没有列出,可以找下。



在用Quartz的时候遇到


Caused by: java.lang.IncompatibleClassChangeError: class org.springframework.scheduling.quartz.CronTriggerBean has interface org.quartz.CronTrigger as super class

这个是由于网上的很多例子都是quartz版本稍旧的原因,我用的是quartz 2.1.7



错误:Jobs added with no trigger must be durable

<property name="durability" value="true" />

坑真的是不少,需要一个个解决。最后测试一下,是不是定时执行我们的job:


public class App {

public static void main(String[] args) {

String springConfig = "launch-context.xml";

ApplicationContext context = new ClassPathXmlApplicationContext(

springConfig);

}

}

祝你好运,能够成功。

相关推荐

js中arguments详解

一、简介了解arguments这个对象之前先来认识一下javascript的一些功能:其实Javascript并没有重载函数的功能,但是Arguments对象能够模拟重载。Javascrip中每个函数...

firewall-cmd 常用命令

目录firewalldzone说明firewallzone内容说明firewall-cmd常用参数firewall-cmd常用命令常用命令 回到顶部firewalldzone...

epel-release 是什么

EPEL-release(ExtraPackagesforEnterpriseLinux)是一个软件仓库,它为企业级Linux发行版(如CentOS、RHEL等)提供额外的软件包。以下是关于E...

FullGC详解  什么是 JVM 的 GC
FullGC详解 什么是 JVM 的 GC

前言:背景:一、什么是JVM的GC?JVM(JavaVirtualMachine)。JVM是Java程序的虚拟机,是一种实现Java语言的解...

2024-10-26 08:50 citgpt

使用Spire.Doc组件利用模板导出Word文档
  • 使用Spire.Doc组件利用模板导出Word文档
  • 使用Spire.Doc组件利用模板导出Word文档
  • 使用Spire.Doc组件利用模板导出Word文档
  • 使用Spire.Doc组件利用模板导出Word文档
跨域(CrossOrigin)

1.介绍  1)跨域问题:跨域问题是在网络中,当一个网络的运行脚本(通常时JavaScript)试图访问另一个网络的资源时,如果这两个网络的端口、协议和域名不一致时就会出现跨域问题。    通俗讲...

微服务架构和分布式架构的区别

1、含义不同微服务架构:微服务架构风格是一种将一个单一应用程序开发为一组小型服务的方法,每个服务运行在自己的进程中,服务间通信采用轻量级通信机制(通常用HTTP资源API)。这些服务围绕业务能力构建并...

深入理解与应用CSS clip-path 属性
深入理解与应用CSS clip-path 属性

clip-pathclip-path是什么clip-path 是一个CSS属性,允许开发者创建一个剪切区域,从而决定元素的哪些部分可见,哪些部分会被隐...

2024-10-25 11:51 citgpt

HCNP Routing&Switching之OSPF LSA类型(二)
  • HCNP Routing&Switching之OSPF LSA类型(二)
  • HCNP Routing&Switching之OSPF LSA类型(二)
  • HCNP Routing&Switching之OSPF LSA类型(二)
  • HCNP Routing&Switching之OSPF LSA类型(二)
Redis和Memcached的区别详解
  • Redis和Memcached的区别详解
  • Redis和Memcached的区别详解
  • Redis和Memcached的区别详解
  • Redis和Memcached的区别详解
Request.ServerVariables 大全

Request.ServerVariables("Url")返回服务器地址Request.ServerVariables("Path_Info")客户端提供的路...

python操作Kafka

目录一、python操作kafka1.python使用kafka生产者2.python使用kafka消费者3.使用docker中的kafka二、python操作kafka细...

Runtime.getRuntime().exec详解

Runtime.getRuntime().exec详解概述Runtime.getRuntime().exec用于调用外部可执行程序或系统命令,并重定向外部程序的标准输入、标准输出和标准错误到缓冲池。...

promise.all详解 promise.all是干什么的
promise.all详解 promise.all是干什么的

promise.all详解promise.all中所有的请求成功了,走.then(),在.then()中能得到一个数组,数组中是每个请求resolve抛出的结果...

2024-10-24 16:21 citgpt

Content-Length和Transfer-Encoding详解
  • Content-Length和Transfer-Encoding详解
  • Content-Length和Transfer-Encoding详解
  • Content-Length和Transfer-Encoding详解
  • Content-Length和Transfer-Encoding详解

取消回复欢迎 发表评论: