2024-11-07 "Build a Slacking-Off Website for 100 Yuan!" Part 5: Fetching trending-search data on a schedule with xxl-job


I. Preface

We have already built a complete trending-search component, from backend to frontend, which forms the core of this little site. From here on we will keep refining it to make it better looking and more useful. Today's topic: fetching trending data on a schedule. If the trending data cannot refresh itself, the site loses its core value. I originally used the @Scheduled annotation for the timed task, but it is not flexible enough, so I decided to replace it with the more flexible XXL-Job component.

II. Deploying xxl-job

xxl-job is a lightweight distributed task-scheduling platform whose core design goals are fast development, an easy learning curve, light weight, and easy extensibility. The GitHub repository currently has about 27.3k stars, and it is open source and free, so it is well worth learning.

1. Downloading the repository

Clone the GitHub repository. After downloading, the source layout is:

- xxl-job-admin: the scheduling center
- xxl-job-core: shared dependencies
- xxl-job-executor-samples: sample executors (pick the version that fits; you can use one directly, or use it as a reference to turn an existing project into an executor)
  - xxl-job-executor-sample-springboot: Spring Boot version, managing the executor through Spring Boot; recommended
  - xxl-job-executor-sample-frameless: frameless version

The scheduling center's configuration:

```properties
### Scheduling-center JDBC connection: keep the address consistent with the scheduler database created in section 2.1
spring.datasource.url=jdbc:mysql://127.0.0.1:3306/xxl_job?useUnicode=true&characterEncoding=UTF-8&autoReconnect=true&serverTimezone=Asia/Shanghai
spring.datasource.username=xxx
spring.datasource.password=xxx
spring.datasource.driver-class-name=com.mysql.jdbc.Driver

### Alert mailbox
spring.mail.host=smtp.qq.com
spring.mail.port=25
spring.mail.username=xxx@qq.com
spring.mail.password=xxx
spring.mail.properties.mail.smtp.auth=true
spring.mail.properties.mail.smtp.starttls.enable=true
spring.mail.properties.mail.smtp.starttls.required=true
spring.mail.properties.mail.smtp.socketFactory.class=javax.net.ssl.SSLSocketFactory

### Scheduling-center access token [optional]: enabled when non-empty
xxl.job.accessToken=

### Scheduling-center i18n [required]: defaults to "zh_CN" (Simplified Chinese); options are "zh_CN", "zh_TC" (Traditional Chinese) and "en" (English)
xxl.job.i18n=zh_CN

## Scheduler trigger-pool maximum thread counts [required]
xxl.job.triggerpool.fast.max=200
xxl.job.triggerpool.slow.max=100

### Days to keep scheduling-center log-table data [required]: expired logs are cleaned up automatically; takes effect only when >= 7, otherwise (e.g. -1) auto-cleanup is disabled
xxl.job.logretentiondays=30
```

2. Initializing the tables

Under the doc/db directory there is a SQL file containing the table and seed-data initialization statements; run it before starting XXL-Job so the tables and data are ready.

3. Starting XXL-Job

Find XxlJobAdminApplication and start it, then open http://localhost:12000/xxl-job-admin/toLogin in a browser to reach the XXL-Job login page. Log in with username admin and password 123456 to reach the main console.

III. A custom crawler task

Using XXL-Job is just as simple: a single annotation is enough. Here is how to use it.
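As an aside, the log-retention rule from the scheduling-center configuration above (xxl.job.logretentiondays only takes effect at 7 days or more; anything lower, such as -1, disables auto-cleanup) can be sketched as a tiny helper. This is only an illustration of the documented rule; RetentionRule and effectiveRetentionDays are hypothetical names, not part of XXL-Job:

```java
public class RetentionRule {
    // Mirrors the documented xxl.job.logretentiondays behavior: retention takes
    // effect only when configured to 7 days or more; lower values (e.g. -1)
    // mean automatic log cleanup is disabled.
    static int effectiveRetentionDays(int configured) {
        return configured >= 7 ? configured : -1;
    }

    public static void main(String[] args) {
        System.out.println(effectiveRetentionDays(30)); // 30: cleanup after 30 days
        System.out.println(effectiveRetentionDays(3));  // -1: cleanup disabled
    }
}
```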
1. Adding the XXL-Job dependency

In summo-sbmy-job's pom.xml, add:

```xml
<!-- xxl-job -->
<dependency>
    <groupId>com.xuxueli</groupId>
    <artifactId>xxl-job-core</artifactId>
    <version>2.4.1</version>
</dependency>
```

2. XXL-Job configuration

Add the XXL-Job settings to application.properties:

```properties
# xxl-job
xxl.job.open=true
### xxl-job admin address list, such as "http://address" or "http://address01,http://address02"
xxl.job.admin.addresses=http://127.0.0.1:12000/xxl-job-admin
### xxl-job, access token
xxl.job.accessToken=default_token
### xxl-job executor appname
xxl.job.executor.appname=summo-sbmy
### xxl-job executor log-path
xxl.job.executor.logpath=/root/logs/xxl-job/jobhandler
### xxl-job executor log-retention-days
xxl.job.executor.logretentiondays=30
### xxl-job executor registry-address: default use address to registry, otherwise use ip:port if address is null
xxl.job.executor.address=
### xxl-job executor server-info
xxl.job.executor.ip=
xxl.job.executor.port=9999
```

With the configuration in place, create XxlJobConfig.java under the com.summo.sbmy.job.config package:

```java
package com.summo.sbmy.job.config;

import com.xxl.job.core.executor.impl.XxlJobSpringExecutor;
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;
import org.springframework.beans.factory.annotation.Value;
import org.springframework.boot.autoconfigure.condition.ConditionalOnProperty;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;

/**
 * xxl-job config
 *
 * @author xuxueli 2017-04-28
 */
@Configuration
public class XxlJobConfig {
    private Logger logger = LoggerFactory.getLogger(XxlJobConfig.class);

    @Value("${xxl.job.admin.addresses}")
    private String adminAddresses;

    @Value("${xxl.job.accessToken}")
    private String accessToken;

    @Value("${xxl.job.executor.appname}")
    private String appname;

    @Value("${xxl.job.executor.address}")
    private String address;

    @Value("${xxl.job.executor.ip}")
    private String ip;

    @Value("${xxl.job.executor.port}")
    private int port;

    @Value("${xxl.job.executor.logpath}")
    private String logPath;

    @Value("${xxl.job.executor.logretentiondays}")
    private int logRetentionDays;

    @Bean
    @ConditionalOnProperty(name = "xxl.job.open", havingValue = "true")
    public XxlJobSpringExecutor xxlJobExecutor() {
        logger.info(">>>>>>>>>>> xxl-job config init.");
        XxlJobSpringExecutor xxlJobSpringExecutor = new XxlJobSpringExecutor();
        xxlJobSpringExecutor.setAdminAddresses(adminAddresses);
        xxlJobSpringExecutor.setAppname(appname);
        xxlJobSpringExecutor.setAddress(address);
        xxlJobSpringExecutor.setIp(ip);
        xxlJobSpringExecutor.setPort(port);
        xxlJobSpringExecutor.setAccessToken(accessToken);
        xxlJobSpringExecutor.setLogPath(logPath);
        xxlJobSpringExecutor.setLogRetentionDays(logRetentionDays);
        return xxlJobSpringExecutor;
    }
}
```

With the configuration and the class both in place, restart the application. If all goes well, the executor shows up as registered on XXL-Job's executor-management page.

4. Registering an XXL-Job task

Take the Douyin trending search as an example. Originally we used the @Scheduled annotation:

```java
/**
 * Crawler trigger: runs once an hour
 */
@Scheduled(fixedRate = 1000 * 60 * 60)
public void hotSearch() throws IOException {
    ... ...
}
```

Replace @Scheduled with @XxlJob("douyinHotSearchJob"). The full class:

```java
package com.summo.sbmy.job.douyin;

import com.alibaba.fastjson.JSONArray;
import com.alibaba.fastjson.JSONObject;
import com.google.common.collect.Lists;
import com.summo.sbmy.dao.entity.SbmyHotSearchDO;
import com.summo.sbmy.service.SbmyHotSearchService;
import com.summo.sbmy.service.convert.HotSearchConvert;
import com.xxl.job.core.biz.model.ReturnT;
import com.xxl.job.core.handler.annotation.XxlJob;
import lombok.extern.slf4j.Slf4j;
import okhttp3.OkHttpClient;
import okhttp3.Request;
import okhttp3.Response;
import org.apache.commons.collections4.CollectionUtils;
import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.stereotype.Component;

import java.io.IOException;
import java.util.List;
import java.util.Random;
import java.util.UUID;
import java.util.stream.Collectors;

import static com.summo.sbmy.common.cache.SbmyHotSearchCache.CACHE_MAP;
import static com.summo.sbmy.common.enums.HotSearchEnum.DOUYIN;

/**
 * @author summo
 * @version DouyinHotSearchJob.java, 1.0.0
 * @description Douyin trending-search crawler
 * @date 2024-08-09
 */
@Component
@Slf4j
public class DouyinHotSearchJob {

    @Autowired
    private SbmyHotSearchService sbmyHotSearchService;

    @XxlJob("douyinHotSearchJob")
    public ReturnT<String> hotSearch(String param) throws IOException {
        log.info("Douyin trending-search crawler task started");
        try {
            // Query Douyin's trending data
            OkHttpClient client = new OkHttpClient().newBuilder().build();
            Request request = new Request.Builder()
                .url("https://www.iesdouyin.com/web/api/v2/hotsearch/billboard/word/")
                .method("GET", null)
                .build();
            Response response = client.newCall(request).execute();
            JSONObject jsonObject = JSONObject.parseObject(response.body().string());
            JSONArray array = jsonObject.getJSONArray("word_list");
            List<SbmyHotSearchDO> sbmyHotSearchDOList = Lists.newArrayList();
            for (int i = 0, len = array.size(); i < len; i++) {
                // Read one trending entry
                JSONObject object = (JSONObject) array.get(i);
                // Build the trending record
                SbmyHotSearchDO sbmyHotSearchDO = SbmyHotSearchDO.builder()
                    .hotSearchResource(DOUYIN.getCode()).build();
                // Title
                sbmyHotSearchDO.setHotSearchTitle(object.getString("word"));
                // Third-party ID, derived from the title
                sbmyHotSearchDO.setHotSearchId(getHashId(DOUYIN.getCode() + sbmyHotSearchDO.getHotSearchTitle()));
                // Link
                sbmyHotSearchDO.setHotSearchUrl("https://www.douyin.com/search/"
                    + sbmyHotSearchDO.getHotSearchTitle() + "?type=general");
                // Heat
                sbmyHotSearchDO.setHotSearchHeat(object.getString("hot_value"));
                // Rank order
                sbmyHotSearchDO.setHotSearchOrder(i + 1);
                sbmyHotSearchDOList.add(sbmyHotSearchDO);
            }
            if (CollectionUtils.isEmpty(sbmyHotSearchDOList)) {
                return ReturnT.SUCCESS;
            }
            // Put the data into the cache
            CACHE_MAP.put(DOUYIN.getCode(),
                sbmyHotSearchDOList.stream().map(HotSearchConvert::toDTOWhenQuery).collect(Collectors.toList()));
            // Persist it
            sbmyHotSearchService.saveCache2DB(sbmyHotSearchDOList);
            log.info("Douyin trending-search crawler task finished");
        } catch (IOException e) {
            log.error("Failed to fetch Douyin data", e);
        }
        return ReturnT.SUCCESS;
    }

    /**
     * Derive a unique ID from an article title
     *
     * @param title article title
     * @return unique ID
     */
    private String getHashId(String title) {
        long seed = title.hashCode();
        Random rnd = new Random(seed);
        return new UUID(rnd.nextLong(), rnd.nextLong()).toString();
    }
}
```

In the task-management page of the XXL-Job console, click "Add" to create the task. Once created, run it once manually to verify. That completes the Douyin trending job; the other crawler jobs are configured the same way.

IV. Showing the update time

We now have three trending components (Baidu, Douyin, Zhihu), but we cannot tell when each list was last refreshed, or whether it is real time, so let's display the update time on the card. The optimized component code:

```vue
<template>
  <el-card class="custom-card" v-loading="loading">
    <template #header>
      <div class="card-title">
        <img :src="icon" class="card-title-icon" />
        {{ title }}热榜
        <span class="update-time">{{ formattedUpdateTime }}</span>
      </div>
    </template>
    <div class="cell-group-scrollable">
      <div
        v-for="item in hotSearchData"
        :key="item.hotSearchOrder"
        :class="getRankingClass(item.hotSearchOrder)"
        class="cell-wrapper"
      >
        <span class="cell-order">{{ item.hotSearchOrder }}</span>
        <span class="cell-title hover-effect" @click="openLink(item.hotSearchUrl)">
          {{ item.hotSearchTitle }}
        </span>
        <span class="cell-heat">{{ formatHeat(item.hotSearchHeat) }}</span>
      </div>
    </div>
  </el-card>
</template>

<script>
import apiService from "@/config/apiService.js";

export default {
  props: {
    title: String,
    icon: String,
    type: String,
  },
  data() {
    return {
      hotSearchData: [],
      updateTime: null,
      loading: false,
    };
  },
  created() {
    this.fetchData(this.type);
  },
  computed: {
    formattedUpdateTime() {
      if (!this.updateTime) return "";
      const updateDate = new Date(this.updateTime);
      const now = new Date();
      const timeDiff = now - updateDate;
      const minutesDiff = Math.floor(timeDiff / 1000 / 60);
      if (minutesDiff < 1) {
        return "刚刚更新"; // just updated
      } else if (minutesDiff < 60) {
        return `${minutesDiff}分钟前更新`; // updated N minutes ago
      } else if (minutesDiff < 1440) {
        return `${Math.floor(minutesDiff / 60)}小时前更新`; // updated N hours ago
      } else {
        return updateDate.toLocaleString();
      }
    },
  },
  methods: {
    fetchData(type) {
      this.loading = true;
      apiService
        .get("/hotSearch/queryByType?type=" + type)
        .then((res) => {
          this.hotSearchData = res.data.data.hotSearchDTOList;
          this.updateTime = res.data.data.updateTime;
        })
        .catch((error) => {
          console.error(error);
        })
        .finally(() => {
          this.loading = false;
        });
    },
    getRankingClass(order) {
      if (order === 1) return "top-ranking-1";
      if (order === 2) return "top-ranking-2";
      if (order === 3) return "top-ranking-3";
      return "";
    },
    formatHeat(heat) {
      if (typeof heat === "string" && heat.endsWith("万")) {
        return heat;
      }
      let number = parseFloat(heat);
      if (isNaN(number)) {
        return heat;
      }
      if (number < 1000) {
        return number.toString();
      }
      if (number >= 1000 && number < 10000) {
        return (number / 1000).toFixed(1) + "k";
      }
      if (number >= 10000) {
        return (number / 10000).toFixed(1) + "万";
      }
    },
    openLink(url) {
      if (url) {
        window.open(url, "_blank");
      }
    },
  },
};
</script>

<style scoped>
.custom-card {
  background-color: #ffffff;
  border-radius: 10px;
  box-shadow: 0 4px 8px rgba(0, 0, 0, 0.1);
  margin-bottom: 20px;
}
.custom-card:hover {
  box-shadow: 0 6px 8px rgba(0, 0, 0, 0.25);
}
.el-card__header {
  padding: 10px 18px;
  display: flex;
  justify-content: space-between; /* space out title and update time */
  align-items: center;
}
.card-title {
  display: flex;
  align-items: center;
  font-weight: bold;
  font-size: 16px;
  flex-grow: 1;
}
.card-title-icon {
  fill: currentColor;
  width: 24px;
  height: 24px;
  margin-right: 8px;
}
.update-time {
  font-size: 12px;
  color: #b7b3b3;
  margin-left: auto; /* push to the far right */
}
.cell-group-scrollable {
  max-height: 350px;
  overflow-y: auto;
  padding-right: 16px;
  flex: 1;
}
.cell-wrapper {
  display: flex;
  align-items: center;
  padding: 8px 8px;
  border-bottom: 1px solid #e8e8e8;
}
.cell-order {
  width: 20px;
  text-align: left;
  font-size: 16px;
  font-weight: 700;
  margin-right: 8px;
  color: #7a7a7a;
}
.cell-heat {
  min-width: 50px;
  text-align: right;
  font-size: 12px;
  color: #7a7a7a;
}
.cell-title {
  font-size: 13px;
  color: #495060;
  line-height: 22px;
  flex-grow: 1;
  overflow: hidden;
  text-align: left;
  text-overflow: ellipsis;
}
.top-ranking-1 .cell-order {
  color: #fadb14; /* gold */
}
.top-ranking-2 .cell-order {
  color: #a9a9a9; /* silver */
}
.top-ranking-3 .cell-order {
  color: #d48806; /* bronze */
}
.cell-title.hover-effect {
  cursor: pointer;
  transition: color 0.3s ease;
}
.cell-title.hover-effect:hover {
  color: #409eff;
}
</style>
```

Here is the final look after the change:
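One detail of the Douyin job worth pausing on: getHashId turns a title into a stable ID by seeding java.util.Random with the title's hashCode, so the same title always yields the same UUID and re-crawled entries deduplicate naturally. A standalone sketch of the same trick (HashIdDemo is my name for it; note that two different titles with colliding hashCode values would share an ID):

```java
import java.util.Random;
import java.util.UUID;

public class HashIdDemo {
    // Same approach as getHashId in DouyinHotSearchJob: seed a Random with the
    // title's hashCode so the same title always maps to the same UUID string.
    static String getHashId(String title) {
        Random rnd = new Random(title.hashCode());
        return new UUID(rnd.nextLong(), rnd.nextLong()).toString();
    }

    public static void main(String[] args) {
        String a = getHashId("douyin:某个热搜标题");
        String b = getHashId("douyin:某个热搜标题");
        System.out.println(a.equals(b)); // true: deterministic per title
    }
}
```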
With that, the XXL-Job rework of the trending components is complete; the full code is in my repository.

Bonus: a Bilibili crawler

1. Evaluating the approach

Bilibili does not have a trending-search list as such, only trending videos, but the logic is the same. The endpoint is:

https://api.bilibili.com/x/web-interface/ranking/v2

It returns JSON, which makes things easy; just look at the structure.

2. Parsing code

You can generate the calling code with Postman; I will skip that walkthrough and go straight to BilibiliHotSearchJob:

```java
package com.summo.sbmy.job.bilibili;

import java.io.IOException;
import java.util.Calendar;
import java.util.List;
import java.util.stream.Collectors;

import com.alibaba.fastjson.JSONArray;
import com.alibaba.fastjson.JSONObject;
import com.google.common.collect.Lists;
import com.summo.sbmy.common.model.dto.HotSearchDetailDTO;
import com.summo.sbmy.dao.entity.SbmyHotSearchDO;
import com.summo.sbmy.service.SbmyHotSearchService;
import com.summo.sbmy.service.convert.HotSearchConvert;
import com.xxl.job.core.biz.model.ReturnT;
import com.xxl.job.core.handler.annotation.XxlJob;
import lombok.extern.slf4j.Slf4j;
import okhttp3.OkHttpClient;
import okhttp3.Request;
import okhttp3.Response;
import org.apache.commons.collections4.CollectionUtils;
import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.stereotype.Component;

import static com.summo.sbmy.common.cache.SbmyHotSearchCache.CACHE_MAP;
import static com.summo.sbmy.common.enums.HotSearchEnum.BILIBILI;

/**
 * @author summo
 * @version BilibiliHotSearchJob.java, 1.0.0
 * @description Bilibili trending crawler
 * @date 2024-08-19
 */
@Component
@Slf4j
public class BilibiliHotSearchJob {

    @Autowired
    private SbmyHotSearchService sbmyHotSearchService;

    @XxlJob("bilibiliHotSearchJob")
    public ReturnT<String> hotSearch(String param) throws IOException {
        log.info("Bilibili trending crawler task started");
        try {
            // Query Bilibili's ranking data
            OkHttpClient client = new OkHttpClient().newBuilder().build();
            Request request = new Request.Builder()
                .url("https://api.bilibili.com/x/web-interface/ranking/v2")
                .addHeader("User-Agent", "Mozilla/5.0 (compatible)")
                .addHeader("Cookie", "b_nut=1712137652; "
                    + "buvid3=DBA9C433-8738-DD67-DCF5"
                    + "-DDC780CA892052512infoc")
                .method("GET", null)
                .build();
            Response response = client.newCall(request).execute();
            JSONObject jsonObject = JSONObject.parseObject(response.body().string());
            JSONArray array = jsonObject.getJSONObject("data").getJSONArray("list");
            List<SbmyHotSearchDO> sbmyHotSearchDOList = Lists.newArrayList();
            for (int i = 0, len = array.size(); i < len; i++) {
                // Read one trending entry
                JSONObject object = (JSONObject) array.get(i);
                // Build the trending record
                SbmyHotSearchDO sbmyHotSearchDO = SbmyHotSearchDO.builder()
                    .hotSearchResource(BILIBILI.getCode()).build();
                // Third-party ID
                sbmyHotSearchDO.setHotSearchId(object.getString("aid"));
                // Link
                sbmyHotSearchDO.setHotSearchUrl(object.getString("short_link_v2"));
                // Title
                sbmyHotSearchDO.setHotSearchTitle(object.getString("title"));
                // Author name
                sbmyHotSearchDO.setHotSearchAuthor(object.getJSONObject("owner").getString("name"));
                // Author avatar
                sbmyHotSearchDO.setHotSearchAuthorAvatar(object.getJSONObject("owner").getString("face"));
                // Cover image
                sbmyHotSearchDO.setHotSearchCover(object.getString("pic"));
                // Heat (view count)
                sbmyHotSearchDO.setHotSearchHeat(object.getJSONObject("stat").getString("view"));
                // Rank order
                sbmyHotSearchDO.setHotSearchOrder(i + 1);
                sbmyHotSearchDOList.add(sbmyHotSearchDO);
            }
            if (CollectionUtils.isEmpty(sbmyHotSearchDOList)) {
                return ReturnT.SUCCESS;
            }
            // Put the data into the cache
            CACHE_MAP.put(BILIBILI.getCode(), HotSearchDetailDTO.builder()
                // Trending data
                .hotSearchDTOList(
                    sbmyHotSearchDOList.stream().map(HotSearchConvert::toDTOWhenQuery).collect(Collectors.toList()))
                // Update time
                .updateTime(Calendar.getInstance().getTime()).build());
            // Persist it
            sbmyHotSearchService.saveCache2DB(sbmyHotSearchDOList);
            log.info("Bilibili trending crawler task finished");
        } catch (IOException e) {
            log.error("Failed to fetch Bilibili data", e);
        }
        return ReturnT.SUCCESS;
    }
}
```

Take a look at the result: the four trending cards in the first row are all populated.
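The raw view counts the Bilibili job stores in hotSearchHeat are formatted on the frontend by formatHeat. The same thresholds can be sketched in Java; HeatFormat is a hypothetical class of mine, not part of the project, and unlike the Vue version it returns the original string for values under 1000 to avoid "500.0"-style output:

```java
import java.util.Locale;

public class HeatFormat {
    // Java sketch of the frontend formatHeat thresholds: values already ending
    // in "万" or non-numeric pass through; under 1000 is shown as-is;
    // 1000-9999 becomes "x.xk"; 10000+ becomes "x.x万".
    static String formatHeat(String heat) {
        if (heat.endsWith("万")) {
            return heat;
        }
        double n;
        try {
            n = Double.parseDouble(heat);
        } catch (NumberFormatException e) {
            return heat;
        }
        if (n < 1000) {
            return heat;
        }
        if (n < 10000) {
            return String.format(Locale.ROOT, "%.1fk", n / 1000);
        }
        return String.format(Locale.ROOT, "%.1f万", n / 10000);
    }

    public static void main(String[] args) {
        System.out.println(formatHeat("1500"));  // 1.5k
        System.out.println(formatHeat("25000")); // 2.5万
    }
}
```

Locale.ROOT keeps the decimal separator a dot regardless of the JVM's default locale.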
