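Deploying gitlab-runner on Kubernetes

Posted on 2018-11-08 | Category: gitlab

Background

First, a complaint: GitLab's official documentation for this is genuinely unclear: https://docs.gitlab.com/runner/install/kubernetes.html
I followed the steps in that document and did not succeed. The runner ended up logging these errors: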

2018/11/8 上午9:57:03 Listen address not defined, session server disabled builds=0
2018/11/8 上午9:57:04 ERROR: Checking for jobs... forbidden runner=xhAyxB5x
2018/11/8 上午9:57:08 ERROR: Checking for jobs... forbidden runner=xhAyxB5x
2018/11/8 上午9:57:11 ERROR: Checking for jobs... forbidden runner=xhAyxB5x
2018/11/8 上午9:57:11 ERROR: Runner http://1.2.3.4/*** is not healthy and will be disabled!

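Root cause

The problem is the ConfigMap. It took me a long time to understand what the Note in the official documentation actually means:

First: the token in config.toml is the token generated after a runner has registered successfully (in effect a random ID). It is not the registration token.

Second: so what is the registration token? It is the token shown on the GitLab Runners admin page.

Note! What config.toml needs is the token of an already registered runner, which is the one shown on that runner's details page.

Solution

1. If you already have a registered runner and just want to switch it over to k8s, following the official configuration works fine as long as you pick the correct token.

2. However, in most cases we need to register a brand-new runner directly from k8s. Some readers may suggest registering from the command line inside the official Pod. I tried that too, and it failed: the config.toml that the ConfigMap mounts into the container is read-only, so the parameters entered through gitlab-ci-multi-runner register inside the container cannot be written back and never take effect. My approach, in short: do not mount the ConfigMap, and register manually.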

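Steps

I use Rancher for cluster management, so everything below is done in the Rancher console.

Create the namespace

Select the cluster -> Projects/Namespaces -> project System -> Add Namespace

Deploy the gitlab-runner service

Project System -> Workloads -> Import YAML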

apiVersion: apps/v1
kind: Deployment
metadata:
  name: gitlab-runner
  namespace: gitlab
spec:
  replicas: 1
  selector:
    matchLabels:
      name: gitlab-runner
  template:
    metadata:
      labels:
        name: gitlab-runner
    spec:
      containers:
      - args:
        - run
        image: gitlab/gitlab-runner:latest
        imagePullPolicy: Always
        name: gitlab-runner
        volumeMounts:
        - mountPath: /etc/ssl/certs
          name: cacerts
          readOnly: true
      restartPolicy: Always
      volumes:
      - hostPath:
          path: /usr/share/ca-certificates/mozilla
        name: cacerts

After a few minutes the workload finishes deploying.

Register and activate gitlab-runner

Open a shell in the Pod directly from Rancher.

/# gitlab-ci-multi-runner register
Runtime platform                                    arch=amd64 os=linux pid=44 revision=cf91d5e1 version=11.4.2
Running in system-mode.

Please enter the gitlab-ci coordinator URL (e.g. https://gitlab.com/):
[IP address of your GitLab server]
Please enter the gitlab-ci token for this runner:
[this is the registration token]
Please enter the gitlab-ci description for this runner:
[gitlab-runner-5f98946bdd-zspjp]: frist runner
Please enter the gitlab-ci tags for this runner (comma separated):
k8s
Registering runner... succeeded                     runner=xhAyxB5x
Please enter the executor: docker, ssh, docker+machine, docker-ssh+machine, docker-ssh, parallels, shell, virtualbox, kubernetes:
kubernetes
Runner registered successfully. Feel free to start it, but if it's running already the config should be automatically reloaded!
/# gitlab-ci-multi-runner start
Runtime platform                                    arch=amd64 os=linux pid=56 revision=cf91d5e1 version=11.4.2
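The interactive session above can also be scripted. The following is a minimal sketch, not part of the original procedure: it assumes the gitlab-runner binary in the image accepts the register flags --non-interactive, --url, --registration-token, --executor, --description and --tag-list, and the URL and token in the usage lines are placeholders. The helper name build_register_command is made up for illustration.

```python
import shlex


def build_register_command(url, registration_token, description, tags):
    """Return the argv for a non-interactive `gitlab-runner register` call.

    The answers mirror the prompts of the interactive session:
    coordinator URL, registration token, description, tags, executor.
    """
    return [
        "gitlab-runner", "register",
        "--non-interactive",
        "--url", url,
        "--registration-token", registration_token,
        "--executor", "kubernetes",
        "--description", description,
        "--tag-list", ",".join(tags),
    ]


if __name__ == "__main__":
    # Placeholder values; replace them with your GitLab address and token.
    argv = build_register_command(
        "http://gitlab.example.com/", "REGISTRATION_TOKEN", "first runner", ["k8s"]
    )
    # Print a shell-safe command line that can be pasted into the Pod shell.
    print(" ".join(shlex.quote(arg) for arg in argv))
```

Building the argument list separately from running it makes it easy to review the command before pasting it into the Pod.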

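Confirm the registration

Once registered and activated, the runner appears in the runner list in the GitLab console, and Last contact shows the time of its most recent contact. If it does not, check the network, firewall, and so on.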

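Upgrading GitLab on CentOS 7

Posted on 2018-11-08 | Category: gitlab

Prerequisites

My method only works for minor-version upgrades!!! For example, I upgraded from gitlab-ce-10.1.2 to gitlab-ce-10.8.6. Across major versions the service dependencies change a lot, so use the official upgrade procedure instead. In addition, backup files are not compatible across major versions.

Environment

Operating system: CentOS Linux release 7.4.1708
GitLab: gitlab-ce-10.1.2-ce.0.el7
Installation method: rpm

Resources

Download gitlab-ce-10.8.6-ce.0.el7.x86_64.rpm

Official site: https://packages.gitlab.com/gitlab/gitlab-ce
Mirror: https://mirrors.tuna.tsinghua.edu.cn/gitlab-ce/yum/el7/

Backup

By default GitLab writes backup files to /var/opt/gitlab/backups. To change this, edit the gitlab_rails['backup_path'] parameter in /etc/gitlab/gitlab.rb.

gitlab-rake gitlab:backup:create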

Upgrade

Stop some of the services. Note: only some of them; the upgrade may fail if every service is stopped.

sudo gitlab-ctl stop unicorn
sudo gitlab-ctl stop sidekiq
sudo gitlab-ctl stop nginx

Install

rpm -Uvh gitlab-ce-10.8.6-ce.0.el7.x86_64.rpm

Reconfigure

gitlab-ctl reconfigure

Restart

gitlab-ctl restart
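The order of the steps matters (back up, stop part of the services, install, reconfigure, restart), so it can help to write the sequence down as data. The sketch below is only an illustration of that ordering; the helper names upgrade_commands and run_plan are made up, and the default dry-run mode only prints the commands instead of executing them.

```python
import subprocess


def upgrade_commands(rpm_file):
    """Return the upgrade steps in the order used in this post."""
    return [
        ["gitlab-rake", "gitlab:backup:create"],  # back up first
        ["gitlab-ctl", "stop", "unicorn"],        # stop only part of the services
        ["gitlab-ctl", "stop", "sidekiq"],
        ["gitlab-ctl", "stop", "nginx"],
        ["rpm", "-Uvh", rpm_file],                # install the new package
        ["gitlab-ctl", "reconfigure"],
        ["gitlab-ctl", "restart"],
    ]


def run_plan(commands, dry_run=True):
    """Print each command; execute it only when dry_run is False."""
    for cmd in commands:
        print("+", " ".join(cmd))
        if not dry_run:
            # check=True aborts the plan as soon as one step fails.
            subprocess.run(cmd, check=True)


if __name__ == "__main__":
    run_plan(upgrade_commands("gitlab-ce-10.8.6-ce.0.el7.x86_64.rpm"))
```

Running the real plan requires root privileges, which is why the sketch defaults to a dry run.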

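Setting a proxy for add-apt-repository

Posted on 2018-10-24 | Category: Operating Systems

add-apt-repository and apt-get sometimes have to go through a proxy.

However, in many cases exporting http_proxy and https_proxy alone has no effect.

Solution

export http_proxy=http://<proxy>:<port>
export https_proxy=http://<proxy>:<port>
sudo -E add-apt-repository ppa:linaro-maintainers/toolchain

The root cause is that the user who set the proxy and the user who runs the command are not the same: sudo runs the command as root and, by default, does not carry over the calling user's environment variables.

The -E option tells sudo to preserve the current environment variables.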
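The effect can be reproduced without sudo. The sketch below is an analogy, not a model of sudo itself: it starts a child process twice, once with an environment that lacks http_proxy and once with one that contains it, to show that a process only sees the variables it is actually given. The proxy address is a placeholder and the helper name proxy_seen_by_child is made up.

```python
import os
import subprocess
import sys

# Small child program: report the http_proxy value it can see.
CHILD = "import os; print(os.environ.get('http_proxy', 'unset'))"


def proxy_seen_by_child(env):
    """Run a child Python process with exactly `env` and return what it prints."""
    result = subprocess.run(
        [sys.executable, "-c", CHILD],
        env=env, capture_output=True, text=True, check=True,
    )
    return result.stdout.strip()


if __name__ == "__main__":
    # Environment without the proxy variable (similar to plain sudo).
    stripped = {k: v for k, v in os.environ.items() if k != "http_proxy"}
    # Environment that keeps the proxy variable (similar to sudo -E).
    preserved = dict(stripped, http_proxy="http://proxy.example:3128")
    print("without variable:", proxy_seen_by_child(stripped))
    print("with variable:   ", proxy_seen_by_child(preserved))
```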

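Using bucketed tables in Hive

Posted on 2018-10-24 | Category: hive

Partitioning and bucketing

Partitioning:

Partitioning splits a table's data by the values of one or more partition columns
In many scenarios, partitioning reduces the total amount of data each query scans
Each partition is stored in its own subdirectory whose name encodes the partition column name and value; partition columns are not stored in the data files

Bucketing:

Bucketing is implemented by hashing a specified column: the hash value splits the data under that column into a set of buckets, and each bucket corresponds to one storage file
Hive hashes the bucketing value and takes the remainder of the hash divided by the number of buckets, which spreads rows roughly evenly across buckets as long as the values themselves are well distributed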
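The hash-then-modulo rule can be sketched in a few lines. This is an illustration only: the hash below is a Java-style 31-multiplier string hash used as a stand-in and is not claimed to be byte-for-byte what Hive computes; the function names are made up.

```python
from collections import Counter


def string_hash(value):
    """Java-style 31-multiplier hash over UTF-8 bytes, kept to 32 bits (stand-in)."""
    h = 0
    for byte in value.encode("utf-8"):
        h = (31 * h + byte) & 0xFFFFFFFF
    return h


def bucket_for(value, num_buckets):
    """Clear the sign bit, then take the remainder by the bucket count."""
    return (string_hash(value) & 0x7FFFFFFF) % num_buckets


if __name__ == "__main__":
    # Assign 1000 synthetic uids to 10 buckets and show how many land in each.
    counts = Counter(bucket_for(f"uid{i}", 10) for i in range(1000))
    for bucket in sorted(counts):
        print(bucket, counts[bucket])
```

Because the bucket depends only on the value, equal uids always land in the same bucket, which is what makes bucket-wise joins possible.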

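Although bucketing is a Hive table-creation mechanism, in my tests it did improve the performance of Impala JOIN queries. I suspect the following reasons:

More files, and therefore more parallelism
Distributing the data evenly across buckets by uid greatly increases the local hit rate
Sorting within each bucket, combined with Impala statistics, lets min/max storage indexes speed up filtering

Update
Impala is able to recognize Hive's bucketing information.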

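Creating a bucketed table

The following SQL creates a table partitioned by pt_dt and bucketed by uid into 10 buckets, with each bucket sorted by uid

CREATE TABLE t (
    uid STRING COMMENT 'user uid'
)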
PARTITIONED BY (pt_dt STRING)
CLUSTERED BY (uid) SORTED BY (uid ASC) INTO 10 BUCKETS
ROW FORMAT delimited FIELDS TERMINATED BY ','
STORED AS PARQUET TBLPROPERTIES('parquet.compression'='SNAPPY');

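Inserting data

Note: the following parameter must be set before inserting data, otherwise the data will be unevenly distributed across buckets:

SET hive.enforce.bucketing = true;

The following parameters should be set along with it

-- Enable dynamic partitioning
SET hive.exec.dynamic.partition=true;
SET hive.exec.dynamic.partition.mode=nonstrict;
-- Tune according to the number of partitions and the data volume
SET hive.exec.max.dynamic.partitions=2048;
SET hive.exec.max.dynamic.partitions.pernode=512;
SET mapreduce.map.memory.mb=8000;
SET mapred.reduce.tasks=100;

Insert the data

INSERT OVERWRITE TABLE t PARTITION(pt_dt) SELECT * from t2;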

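Issues

When data is written into the same partition by separate INSERT INTO batches, the existing bucket files are not appended to; a large number of new files is generated instead

For example, inserting into the same partition twice doubles the number of bucket files

So use INSERT OVERWRITE whenever possible and avoid INSERT INTO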

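ClickHouse local tables and distributed tables

Posted on 2018-10-22 | Category: ClickHouse

Table engines

The ClickHouse table engine determines the following:

How data is stored: where it is written to and how it is read.
Which queries are supported, and how.
Concurrent data access.
Use of indexes.
Whether multithreaded request execution is possible.
How data is replicated.

When reading data, the engine only needs to extract the required set of columns. In some cases, however, the query may be partially processed inside the table engine.

In most scenarios, the engines we use belong to the MergeTree family.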


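Ibis, a Python data analysis framework

Posted on 2018-10-22 | Category: Python

About ibis

Website: https://docs.ibis-project.org/index.html

ibis is a new Python data analysis framework that bridges the local Python environment (e.g. pandas, scikit-learn) and remote big data environments (e.g. hdfs, hive, impala, spark). Its goal is to let data scientists and data engineers work with large datasets as productively as they do with small and medium-sized ones, while making full use of single-machine resources.

Environment

pip install ibis-framework

Example

Implement the following SQL:

-- ibis returns at most 10000 rows by default
SELECT * FROM d.t LIMIT 10000;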

The ibis implementation:

import ibis
# Connect the client
client = ibis.impala.connect(host='0.0.0.0', port=20050, auth_mechanism="GSSAPI", kerberos_service_name='impala')
# Access the table
table = client.table('t', database='d')
# Run the query
df = table.execute()
# The result is a pandas.core.frame.DataFrame
df.describe()

A slightly more complex example

Implement the following SQL:

SELECT count(distinct(id)), pt_dt
FROM d.t
WHERE (pt_dt >= "2018-08-01"
       AND pt_dt <= "2018-08-28")
GROUP BY pt_dt;

The ibis implementation:

import ibis
client = ibis.impala.connect(host='0.0.0.0', port=20050, auth_mechanism="GSSAPI", kerberos_service_name='impala')

# SELECT
table = client.table('t', database='d')
t = table['id', 'pt_dt']
# WHERE (same date range as the SQL above)
filtered = t.filter([t.pt_dt >= "2018-08-01", t.pt_dt <= "2018-08-28"])
# DISTINCT
metric = filtered.id.nunique()
# GROUP BY
expr = filtered.group_by('pt_dt').aggregate(unique_id=metric)
# RUN
df = expr.execute()
df.describe()

Hive dynamic partition errors

Posted on 2018-10-17 | Category: hive

Error message

Error: java.lang.RuntimeException: 
org.apache.hadoop.hive.ql.metadata.HiveFatalException: 
[Error 20004]: Fatal error occurred when node tried to create too many dynamic partitions. 
The maximum number of dynamic partitions is controlled by hive.exec.max.dynamic.partitions and hive.exec.max.dynamic.partitions.pernode. 
Maximum was set to: 100

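Cause

Hive limits the number of dynamic partitions it will create. The default is 100 dynamic partitions per node, with a default total limit of 1000 dynamic partitions across all nodes. Both limits can be adjusted.

Solution

set hive.exec.dynamic.partition=true;
set hive.exec.max.dynamic.partitions=2048;
set hive.exec.max.dynamic.partitions.pernode=256;

These settings alone do not guarantee success; sometimes the number of reducers also has to be set to work with dynamic partitioning

set mapred.reduce.tasks=10;

These parameters need to satisfy the following condition:

hive.exec.max.dynamic.partitions / hive.exec.max.dynamic.partitions.pernode <= mapred.reduce.tasks

In the example above, 2048 / 256 = 8, so a mapred.reduce.tasks value below 8 causes the error. Hive by default adjusts the number of reducers dynamically based on the data volume, so it sometimes has to be set manually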
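The condition is easy to check before submitting a job. The sketch below simply encodes the rule of thumb stated in this post (it is my observation, not an official Hive formula); the function names are made up for illustration.

```python
import math


def min_reduce_tasks(max_partitions, max_partitions_pernode):
    """Smallest reducer count that satisfies partitions / pernode <= reducers."""
    return math.ceil(max_partitions / max_partitions_pernode)


def settings_consistent(max_partitions, max_partitions_pernode, reduce_tasks):
    """True when the three settings satisfy the condition from this post."""
    return reduce_tasks >= min_reduce_tasks(max_partitions, max_partitions_pernode)


if __name__ == "__main__":
    # Values from the example: 2048 total, 256 per node, 10 reducers.
    print(min_reduce_tasks(2048, 256))
    print(settings_consistent(2048, 256, 10))
    print(settings_consistent(2048, 256, 7))
```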

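A blog from 0 to 1

Posted on 2018-10-16 | Updated on 2018-10-22 | Category: Blog

Setting up the blog

Blog components

Blog framework: hexo
Hosting: Github Pages
Blog editor: hexo-editor
Blog theme: theme-next

Overview of the steps

Step 1: build the blog with hexo + github. There are plenty of tutorials online; refer to them if my write-up is not clear enough

Step 2: the blog editor!

Anyone who has finished step 1 knows that the editing and publishing workflow is not very convenient. Here is the setup I use: VPS + hexo-editor

My publishing workflow

Open the blog editor: http://vps-IP:2048
Edit the post, then publish

Generate, equivalent to running hexo g
Deploy, equivalent to hexo clean && hexo d

A few minutes later the new post shows up on the github page!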

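Building the blog with hexo + github

If you already have a blog set up on github, you can skip this step

Preparing the environment

A VPS. I use vultr, which accepts WeChat Pay, Alipay, and Paypal; take your pick

Installation of the following is not covered here

Nodejs
Git

Hexo blog

Install Hexo

npm install -g hexo-cli

Initialize the blog project

hexo creates the blog project in the current working directory

hexo init blog

Enter the blog folder, install the dependencies, generate the site files, and start the local server

cd blog
npm install
hexo g
hexo s

Now open http://localhost:4000/ to see the local blog we just built. Hexo generates a Hello World post by default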

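Deploying to GitHub Pages

Create the personal site repository

Each account can have only one repository for its personal site, and the repository must be named username/username.github.io; this is a special naming convention. You can then reach your personal site at http://username.github.io.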
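Since everything is derived from the account name, the naming convention can be written down as three small helpers. This is only a sketch with made-up function names; the SSH remote format is the one used in the deploy configuration of this post.

```python
def pages_repo_name(username):
    """Repository name required by the personal-site convention."""
    return f"{username}.github.io"


def pages_ssh_remote(username):
    """SSH remote for that repository, as used in the hexo deploy setting."""
    return f"git@github.com:{username}/{pages_repo_name(username)}.git"


def pages_url(username):
    """Address under which the personal site is served."""
    return f"http://{pages_repo_name(username)}"


if __name__ == "__main__":
    user = "username"  # placeholder account name
    print(pages_repo_name(user))
    print(pages_ssh_remote(user))
    print(pages_url(user))
```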

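Configure SSH

To deploy code from the client to github without a password, SSH needs to be configured

# Set basic identity information on the client
git config --global user.name "GitHub username"
git config --global user.email "GitHub email"
# Generate a key pair
ssh-keygen -t rsa -C "email address"

Once generated, the SSH key is saved in the ~/.ssh directory by default. Open that directory, open the id_rsa.pub file, and copy its entire contents, i.e. the public key.

Open GitHub, click your avatar -> Settings -> SSH and GPG keys -> New SSH key, paste the copied key into the key field, and click Add Key. The SSH key is now configured.

blog project configuration

Edit the global configuration file _config.yml in the hexo folder and change the deploy section so that the local hexo project is hosted on GitHub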

deploy:
  type: git    # deployment type
  repository: git@github.com:tyyzqmf/tyyzqmf.github.io.git # repository URL
  branch: master    # branch name
  message: hexo deploy    # commit message

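Publishing

Install the plugin

npm install hexo-deployer-git --save

Publish

hexo clean && hexo g && hexo d

Enter http://username.github.io in the browser's address bar and you can see that the blog has been deployed to GitHub

If something goes wrong, check the error message on github: in the github project go to Settings and scroll down to GitHub Pages.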

Blog editor

Installing the editor

git clone https://github.com/tajpure/hexo-editor.git
cd hexo-editor
npm install --production
npm start

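Configuring _config.yml

_config.yml is in the hexo-editor root directory and provides a few simple options.

Setting the environment

For desktop use, change local to true; no login is required in this mode.
When deployed on a server, keep local at its default value false; a username and password must then be configured.

Setting the username and password

For desktop use, ignore this setting

Enter the username and password you want to use in username and password; do not keep the default values.

Setting the blog directory (use an absolute path)

Set base_dir to the path of the hexo blog directory. For example, if your blog directory is "/home/user/blog", set base_dir to that path.

To use the deploy feature provided by hexo-editor, you need to set up hexo's deploy configuration and add your ssh key to authorized_keys on the deploy server so that deployment works without a password.

Setting the port

The default port is 2048. To use a custom port, just change it.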
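The rules above can be turned into a small sanity check. The sketch below works on a plain dictionary rather than on the real file; the keys local, username, password and base_dir are the ones named in this post, while the key name port and the function name check_editor_config are assumptions made for illustration.

```python
import posixpath


def check_editor_config(config):
    """Return a list of problems found in a hexo-editor style config dict."""
    problems = []
    # Server mode (local: false, the default) requires credentials.
    if not config.get("local", False):
        if not config.get("username") or not config.get("password"):
            problems.append("server mode (local: false) needs username and password")
    # The blog directory has to be an absolute path.
    if not posixpath.isabs(config.get("base_dir", "")):
        problems.append("base_dir must be an absolute path")
    # 2048 is the default port; the key name `port` is an assumption.
    port = config.get("port", 2048)
    if not (isinstance(port, int) and 0 < port < 65536):
        problems.append("port must be an integer between 1 and 65535")
    return problems


if __name__ == "__main__":
    desktop = {"local": True, "base_dir": "/home/user/blog"}
    server = {"local": False, "base_dir": "blog"}
    print(check_editor_config(desktop))
    print(check_editor_config(server))
```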

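Installing ClickHouse

Posted on 2018-10-15 | Updated on 2018-10-22 | Category: ClickHouse

Package source

Officially only ubuntu and Docker installation methods are provided, so installing on Centos requires a third-party repository

Preparing the environment

Test environment: 4 servers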

192.168.1.1    node1
192.168.1.2    node2
192.168.1.3    node3
192.168.1.4    node4

Test environment: Zookeeper cluster

zk1:2181
zk2:2181
zk3:2181

Installing clickhouse

Install the dependencies

sudo yum install -y pygpgme yum-utils libicu

Add the yum repo

Create the file /etc/yum.repos.d/altinity_clickhouse.repo

[altinity_clickhouse]
name=altinity_clickhouse
baseurl=https://packagecloud.io/altinity/clickhouse/el/6/$basearch
repo_gpgcheck=1
gpgcheck=0
enabled=1
gpgkey=https://packagecloud.io/altinity/clickhouse/gpgkey
sslverify=1
sslcacert=/etc/pki/tls/certs/ca-bundle.crt
metadata_expire=300

[altinity_clickhouse-source]
name=altinity_clickhouse-source
baseurl=https://packagecloud.io/altinity/clickhouse/el/6/SRPMS
repo_gpgcheck=1
gpgcheck=0
enabled=1
gpgkey=https://packagecloud.io/altinity/clickhouse/gpgkey
sslverify=1
sslcacert=/etc/pki/tls/certs/ca-bundle.crt
metadata_expire=300

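Refresh the yum cache and install

sudo yum -q makecache -y --disablerepo='*' --enablerepo='altinity_clickhouse'
sudo yum install -y clickhouse-server clickhouse-client

Edit the configuration files

By default clickhouse puts its data files under /var (and its configuration under /etc). Since the system disk and the data disk on our servers are mounted separately, that layout is clearly not suitable. To simplify management, we store the data, logs, and configuration file under one common root path.

Edit the /etc/rc.d/init.d/clickhouse-server file

CLICKHOUSE_LOGDIR=/data/clickhouse/log
CLICKHOUSE_DATADIR_OLD=/data/clickhouse/data_old
CLICKHOUSE_CONFIG=/data/clickhouse/config.xml

Edit the core configuration file config.xml and copy it to the other nodes. Note that the hostname differs on each node!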

<?xml version="1.0"?>
<yandex>
    <logger>
        <level>trace</level>
        <log>/data/clickhouse/log/clickhouse-server/clickhouse-server.log</log>
        <errorlog>/data/clickhouse/log/clickhouse-server/clickhouse-server.err.log</errorlog>
        <size>1000M</size>
        <count>10</count>
        <!-- <console>1</console> --> <!-- Default behavior is autodetection (log to console if not daemon mode and is tty) -->
    </logger>
    <http_port>8123</http_port>
    <tcp_port>9000</tcp_port>
    <interserver_http_port>9009</interserver_http_port>

    <!-- Hostname (differs on each node) -->
    <interserver_http_host>node1</interserver_http_host>
    
    <!-- Listen address (required for a cluster setup) -->
    <listen_host>0.0.0.0</listen_host>
    
    <!-- Data directory -->
    <path>/data/clickhouse/</path>

    <tmp_path>/data/clickhouse/tmp/</tmp_path>

    <!-- User-related settings, left at their defaults for now -->
    <user_files_path>/data/clickhouse/user_files/</user_files_path>
    <users_config>users.xml</users_config>
    <default_profile>default</default_profile>
    <default_database>default</default_database>

    <!-- Cluster configuration -->
    <remote_servers incl="clickhouse_remote_servers" >
        <test>
            <!-- Data shard 1 -->
            <shard>
                <internal_replication>false</internal_replication>
                <replica>
                    <host>192.168.1.1</host>
                    <port>9000</port>
                </replica>
            </shard>
            <!-- Data shard 2 -->
            <shard>
                <internal_replication>false</internal_replication>
                <replica>
                    <host>192.168.1.2</host>
                    <port>9000</port>
                </replica>
            </shard>
            <!-- Data shard 3 -->
            <shard>
                <internal_replication>false</internal_replication>
                <replica>
                    <host>192.168.1.3</host>
                    <port>9000</port>
                </replica>
            </shard>
            <!-- Data shard 4 -->
            <shard>
                <internal_replication>false</internal_replication>
                <replica>
                    <host>192.168.1.4</host>
                    <port>9000</port>
                </replica>
            </shard>
        </test>
    </remote_servers>
    
    <!-- ZK cluster -->
    <zookeeper incl="zookeeper-servers" optional="true">
        <node index="1">
            <host>zk1</host>
            <port>2181</port>
        </node>
        <node index="2">
            <host>zk2</host>
            <port>2181</port>
        </node>
        <node index="3">
            <host>zk3</host>
            <port>2181</port>
        </node>
    </zookeeper>
</yandex>
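The four shard entries differ only in the host address, so the remote_servers section can be generated instead of typed by hand. The following is a minimal sketch using only Python's standard library; the function name build_remote_servers is made up, and it assumes one replica per shard with internal_replication set to false, as in the configuration of this post.

```python
import xml.etree.ElementTree as ET


def build_remote_servers(cluster_name, hosts, port=9000):
    """Build a <remote_servers> element with one single-replica shard per host."""
    root = ET.Element("remote_servers")
    cluster = ET.SubElement(root, cluster_name)
    for host in hosts:
        shard = ET.SubElement(cluster, "shard")
        ET.SubElement(shard, "internal_replication").text = "false"
        replica = ET.SubElement(shard, "replica")
        ET.SubElement(replica, "host").text = host
        ET.SubElement(replica, "port").text = str(port)
    return root


if __name__ == "__main__":
    hosts = [f"192.168.1.{i}" for i in range(1, 5)]
    root = build_remote_servers("test", hosts)
    ET.indent(root, space="    ")  # pretty-printing; needs Python 3.9+
    print(ET.tostring(root, encoding="unicode"))
```

The printed fragment still has to be pasted into config.xml by hand; the sketch does not edit the file.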

Create the required directories

mkdir -p /data/clickhouse/log
mkdir -p /data/clickhouse/data
chown -R clickhouse:clickhouse /data/clickhouse

Start the service

Start

service clickhouse-server start

Test

$ clickhouse-client

:) use system;

:) select * from clusters;

SELECT *
FROM clusters 

┌─cluster─┬─shard_num─┬─shard_weight─┬─replica_num─┬─host_name─────┬─host_address──┬─port─┬─is_local─┬─user────┬─default_database─┐
│ test    │         1 │            1 │           1 │ 192.168.1.1   │ 192.168.1.1   │ 9000 │        0 │ default │                  │
│ test    │         2 │            1 │           1 │ 192.168.1.2   │ 192.168.1.2   │ 9000 │        1 │ default │                  │
│ test    │         3 │            1 │           1 │ 192.168.1.3   │ 192.168.1.3   │ 9000 │        0 │ default │                  │
│ test    │         4 │            1 │           1 │ 192.168.1.4   │ 192.168.1.4   │ 9000 │        0 │ default │                  │
└─────────┴───────────┴──────────────┴─────────────┴───────────────┴───────────────┴──────┴──────────┴─────────┴──────────────────┘

4 rows in set. Elapsed: 0.001 sec.
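The same check can be made from a script through the HTTP interface that config.xml exposes on http_port 8123. The sketch below is a hedged illustration: it assumes the server accepts a query passed as the query URL parameter, that the default user needs no password, and the helper names are made up. Nothing is sent over the network until run_query is called.

```python
from urllib.parse import urlencode
from urllib.request import urlopen


def query_url(host, sql, port=8123):
    """Build the URL that passes `sql` to the ClickHouse HTTP interface."""
    return f"http://{host}:{port}/?{urlencode({'query': sql})}"


def run_query(host, sql, port=8123, timeout=5):
    """Send the query and return the response body as text (needs a live server)."""
    with urlopen(query_url(host, sql, port), timeout=timeout) as response:
        return response.read().decode("utf-8")


if __name__ == "__main__":
    sql = "SELECT cluster, shard_num, host_name FROM system.clusters"
    # Only print the URL here; call run_query("node1", sql) against a real node.
    print(query_url("node1", sql))
```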