
Fine-tuning LLaMA with MindSpore on the OpenI platform

Clone the pretrained model

Clone the open_llama_7b model repository and download the sharded model files:

git lfs install
git clone https://huggingface.co/openlm-research/open_llama_7b

Prepare the environment

Install Transformers:

pip install transformers

Run the weight conversion script to convert the PyTorch checkpoint into a MindSpore checkpoint (the command below is the author's original example and still references the GLM conversion script and checkpoint names; for open_llama_7b, point the corresponding conversion script at the weights downloaded above):

python mindformers/models/glm/convert_weight.py --pt_ckpt_path /home/ma-user/work/models/mindspore/pt_glm_6b.pth --ms_ckpt_path ../models/mindspore/ms_glm_6b.ckpt

Running the conversion script produces the converted output file ms_glm_6b.ckpt.

Note: this step may fail with a shared-library error related to libgomp.

Workaround:

export LD_PRELOAD=$LD_PRELOAD:/home/ma-user/anaconda3/envs/MindSpore/lib/python3.7/site-packages/torch/lib/libgomp-d22c30c5.so.1

How it works: locate the libgomp-d22c30c5.so.1 bundled with torch and append it to the LD_PRELOAD environment variable. This error appears to occur only on ARM platforms.
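
The hashed suffix in the library filename can differ between torch builds, so a minimal Python sketch (not part of the original workflow; it only assumes torch is importable in the active environment) to find the path on your own installation:

# Locate the libgomp shared object bundled with torch, so the correct path
# can be exported via LD_PRELOAD.
import glob
import os
import torch

torch_lib = os.path.join(os.path.dirname(torch.__file__), "lib")
print(glob.glob(os.path.join(torch_lib, "libgomp*")))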

Prepare the fine-tuning dataset

Fine-tuning method: LoRA

A preprocessing script for the alpaca dataset is currently provided for both full-parameter fine-tuning and LoRA fine-tuning tasks.

Dataset URL: https://github.com/tatsu-lab/stanford_alpaca/blob/main/alpaca_data.json

Sample of the raw alpaca dataset format:

# alpaca examples:
{"instruction": "Describe a time when you had to make a difficult decision.", "input": "", "output": "I had to make a difficult decision when I was working as a project manager at a construction company. I was in charge of a project that needed to be completed by a certain date in order to meet the client\u2019s expectations. However, due to unexpected delays, we were not able to meet the deadline and so I had to make a difficult decision. I decided to extend the deadline, but I had to stretch the team\u2019s resources even further and increase the budget. Although it was a risky decision, I ultimately decided to go ahead with it to ensure that the project was completed on time and that the client\u2019s expectations were met. The project was eventually successfully completed and this was seen as a testament to my leadership and decision-making abilities."},
{"instruction": "Identify the odd one out.", "input": "Twitter, Instagram, Telegram", "output": "Telegram"},

Run alpaca_converter.py, which uses the fastchat tool to add prompt templates and convert the raw dataset into a multi-turn conversation format:

# Script path: tools/dataset_preprocess/llama/alpaca_converter.py
# Run the conversion script
python alpaca_converter.py \
--data_path /home/ma-user/work/data/alpaca_data.json \
--output_path /home/ma-user/work/data/alpaca-data-conversation.json

Parameter description

# Parameter description
data_path: path to the alpaca data
output_path: output path for the converted conversation-format data

Sample after conversion:

{"id": "1","conversations": [{"from": "human","value": "Below is an instruction that describes a task. Write a response that appropriately completes the request.\n\n### Instruction:\nGive three tips for staying healthy.\n\n### Response:"},{"from": "gpt","value": "1.Eat a balanced diet and make sure to include plenty of fruits and vegetables. \n2. Exercise regularly to keep your body active and strong. \n3. Get enough sleep and maintain a consistent sleep schedule."}]},

Run llama_preprocess.py to preprocess the data and generate a MindRecord file, converting the prompt-templated data into mindrecord format.

Install dependencies:

pip install "fschat[model_worker,webui]"

Run the script:

# Script path: tools/dataset_preprocess/llama/llama_preprocess.py
# This tool relies on the fschat package to parse prompt templates; install fschat >= 0.2.13 (Python 3.9) in advance.
python llama_preprocess.py \
--dataset_type qa \
--input_glob /home/ma-user/work/data/alpaca-data-conversation.json \
--model_file /home/ma-user/work/models/open_llama_7b/tokenizer.model \
--seq_length 2048 \
--output_file /home/ma-user/work/models/alpaca-fastchat2048.mindrecord
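
Before launching training, it can be worth reading the generated file back with MindSpore's MindDataset as a sanity check (a minimal sketch, assuming MindSpore is installed; the path and column names mirror the configuration used below):

# Sanity-check the generated MindRecord file.
import mindspore.dataset as ds

data = ds.MindDataset(
    dataset_files="/home/ma-user/work/models/alpaca-fastchat2048.mindrecord",
    columns_list=["input_ids", "labels"],
)
print("number of samples:", data.get_dataset_size())
for row in data.create_dict_iterator(num_epochs=1, output_numpy=True):
    print({name: value.shape for name, value in row.items()})  # one tokenized sample per record
    break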

LoRA fine-tuning

LoRA fine-tuning is currently adapted for the llama_7b model, and a default configuration file configs/llama/run_llama_7b_lora.yaml is provided.

  • Step 1. Modify the configuration file: as with full-parameter fine-tuning, update the training-dataset path and the pretrained-weight path (see the sketch after this list).
  • Step 2. Launch the LoRA fine-tuning task.
    Note: the llama_7b_lora model supports single-card launch; set the use_parallel parameter in the configuration file to False.
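
A minimal sketch of step 1 using PyYAML (this is not a mindformers utility; editing the file by hand works just as well, and note this approach drops comments from the original file):

# Patch the checkpoint and dataset paths in run_llama_7b_lora.yaml (sketch).
import yaml

cfg_path = "./configs/llama/run_llama_7b_lora.yaml"
with open(cfg_path) as f:
    cfg = yaml.safe_load(f)

cfg["load_checkpoint"] = "/home/ma-user/work/models/mindspore/open_llama_7b_ms.ckpt"
cfg["train_dataset"]["data_loader"]["dataset_dir"] = "/home/ma-user/work/models/alpaca-fastchat2048.mindrecord"
cfg["use_parallel"] = False  # single-card launch

with open(cfg_path, "w") as f:
    yaml.safe_dump(cfg, f, sort_keys=False)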

Launch via script

python run_mindformer.py --config=./configs/llama/run_llama_7b_lora.yaml --use_parallel=False --run_mode=finetune

run_llama_7b_lora.yaml

seed: 0
output_dir: './output'  # customization is not currently supported; do not change this default value
load_checkpoint: '/home/ma-user/work/models/mindspore/open_llama_7b_ms.ckpt'
src_strategy_path_or_dir: ''
auto_trans_ckpt: False  # If true, auto transform load_checkpoint to load in distributed model
only_save_strategy: False
resume_training: False
run_mode: 'finetune'

# trainer config
trainer:
  type: CausalLanguageModelingTrainer
  model_name: 'llama_7b_lora'

# runner config
runner_config:
  epochs: 1
  batch_size: 2
  sink_mode: True
  sink_size: 2

# optimizer
optimizer:
  type: FP32StateAdamWeightDecay
  beta1: 0.9
  beta2: 0.95
  eps: 1.e-8
  learning_rate: 1.e-4

# lr schedule
lr_schedule:
  type: CosineWithWarmUpLR
  learning_rate: 1.e-4
  warmup_ratio: 0.03
  total_steps: -1 # -1 means it will load the total steps of the dataset

# dataset
train_dataset: &train_dataset
  data_loader:
    type: MindDataset
    dataset_dir: "/home/ma-user/work/models/alpaca-fastchat2048.mindrecord"
    shuffle: True
  input_columns: ["input_ids", "labels"]  # "input_ids", "labels", labels are used in instruction finetune.
  num_parallel_workers: 8
  python_multiprocessing: False
  drop_remainder: True
  batch_size: 2
  repeat: 1
  numa_enable: False
  prefetch_size: 1
train_dataset_task:
  type: CausalLanguageModelDataset
  dataset_config: *train_dataset
# if True, do evaluate during the training process. if false, do nothing.
# note that the task trainer should support _evaluate_in_training function.
do_eval: False

# eval dataset
eval_dataset: &eval_dataset
  data_loader:
    type: MindDataset
    dataset_dir: "/home/ma-user/work/models/alpaca-fastchat2048.mindrecord"
    shuffle: False
  input_columns: ["input_ids", "labels"]
  num_parallel_workers: 8
  python_multiprocessing: False
  drop_remainder: False
  repeat: 1
  numa_enable: False
  prefetch_size: 1
eval_dataset_task:
  type: CausalLanguageModelDataset
  dataset_config: *eval_dataset

use_parallel: False
# parallel context config
parallel:
  parallel_mode: 1 # 0-data parallel, 1-semi-auto parallel, 2-auto parallel, 3-hybrid parallel
  gradients_mean: False
  enable_alltoall: False
  full_batch: True
  search_mode: "sharding_propagation"
  enable_parallel_optimizer: False
  strategy_ckpt_save_file: "./ckpt_strategy.ckpt"
  parallel_optimizer_config:
    gradient_accumulation_shard: False
    parallel_optimizer_threshold: 64
# default parallel of device num = 8 910A
parallel_config:
  data_parallel: 8
  model_parallel: 1
  pipeline_stage: 1
  use_seq_parallel: False
  optimizer_shard: False
  micro_batch_num: 1
  vocab_emb_dp: True
  gradient_aggregation_group: 4
# when model parallel is greater than 1, we can set micro_batch_interleave_num=2, that may accelerate the train process.
micro_batch_interleave_num: 1

# recompute config
recompute_config:
  recompute: True
  select_recompute: False
  parallel_optimizer_comm_recompute: False
  mp_comm_recompute: True
  recompute_slice_activation: True

# callbacks
callbacks:
  - type: MFLossMonitor
  - type: CheckpointMointor
    prefix: "llama_7b_lora"
    save_checkpoint_steps: 20000
    integrated_save: False
    async_save: False
  - type: ObsMonitor

# mindspore context init config
context:
  mode: 0 #0--Graph Mode; 1--Pynative Mode
  device_target: "Ascend"
  enable_graph_kernel: False
  graph_kernel_flags: "--disable_expand_ops=Softmax,Dropout --enable_parallel_fusion=true --reduce_fuse_depth=8 --enable_auto_tensor_inplace=true"
  max_call_depth: 10000
  max_device_memory: "31GB"
  save_graphs: False
  save_graphs_path: "./graph"
  device_id: 0

# model config
model:
  model_config:
    type: LlamaConfig
    batch_size: 1 # add for increase predict
    seq_length: 2048
    hidden_size: 4096
    num_layers: 32
    num_heads: 32
    vocab_size: 32000
    multiple_of: 256
    rms_norm_eps: 1.0e-6
    bos_token_id: 1
    eos_token_id: 2
    pad_token_id: 0
    ignore_token_id: -100
    compute_dtype: "float16"
    layernorm_compute_dtype: "float32"
    softmax_compute_dtype: "float16"
    rotary_dtype: "float16"
    param_init_type: "float16"
    use_past: False
    pretrain_seqlen: 2048 # seqlen of the pretrain checkpoint: 2048 for llama and 4096 for llama2
    extend_method: "None" # support "None", "PI", "NTK"
    compute_in_2d: False
    use_flash_attention: False
    offset: 0
    use_past_shard: False
    checkpoint_name_or_path: "llama_7b_lora"
    repetition_penalty: 1
    max_decode_length: 512
    top_k: 3
    top_p: 1
    do_sample: False
    pet_config:
      pet_type: lora
      # configuration of lora
      in_channels: 4096
      out_channels: 4096
      lora_rank: 16
      lora_alpha: 16
      lora_dropout: 0.05
  arch:
    type: LlamaForCausalLMWithLora

processor:
  return_tensors: ms
  tokenizer:
    unk_token: '<unk>'
    bos_token: '<s>'
    eos_token: '</s>'
    pad_token: '<pad>'
    type: LlamaTokenizer

# metric
metric:
  type: PerplexityMetric

# wrapper cell config
runner_wrapper:
  type: MFTrainOneStepCell
  scale_sense:
    type: DynamicLossScaleUpdateCell
    loss_scale_value: 4294967296
    scale_factor: 2
    scale_window: 1000
  use_clip_grad: True

eval_callbacks:
  - type: ObsMonitor

auto_tune: False
filepath_prefix: './autotune'
autotune_per_step: 10

profile: False
profile_start_step: 1
profile_stop_step: 10
init_start_profile: False
profile_communication: False
profile_memory: True
layer_scale: False
layer_decay: 0.65
lr_scale_factor: 256

# cfts init config
remote_save_url: "Please input obs url on AICC platform."
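
To put the pet_config numbers in perspective, a rough estimate of the trainable LoRA parameter count (a back-of-the-envelope sketch; the assumption that rank-16 adapters are attached to two 4096x4096 attention projections per layer is illustrative, as the exact set of adapted weights depends on the mindformers implementation):

# Rough LoRA trainable-parameter estimate for the config above.
hidden = 4096          # in_channels / out_channels
rank = 16              # lora_rank
layers = 32            # num_layers
adapted_per_layer = 2  # assumption: e.g. the query and value projections

per_adapter = hidden * rank + rank * hidden   # A (rank x in) + B (out x rank)
total = per_adapter * adapted_per_layer * layers
print(f"~{total:,} trainable LoRA parameters")  # ~8.4M, versus ~6.7B base weights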
