Download the source code

git clone https://github.com/tensorflow/tensorflow

Install Bazel 5.1.1

curl -fLO https://releases.bazel.build/5.1.1/release/bazel-5.1.1-linux-x86_64

chmod +x bazel-5.1.1-linux-x86_64

JDK 1.8 must be installed first. The details of installing the JDK and Bazel are skipped here.

Python 3.10 & GCC

Use Miniconda to create a Python 3.10 environment and activate it.

Then install the C++ kernel for Jupyter Notebook:

conda config --add channels https://mirrors.tuna.tsinghua.edu.cn/anaconda/pkgs/free
conda config --add channels https://mirrors.tuna.tsinghua.edu.cn/anaconda/pkgs/main
conda config --set show_channel_urls yes
conda config --set channel_alias https://mirrors.tuna.tsinghua.edu.cn/anaconda/cloud
conda install xeus-cling -c conda-forge

Reference: https://zhuanlan.zhihu.com/p/84753836

Build

conda install python==3.10
pip3 install numpy
conda install packaging
./configure
bazel build --config=opt //tensorflow/tools/pip_package:build_pip_package

# Build command with CUDA support
bazel build --config=opt --config=cuda //tensorflow/tools/pip_package:build_pip_package

Install

bazel-bin/tensorflow/tools/pip_package/build_pip_package /tmp/tensorflow_pkg
pip install /tmp/tensorflow_pkg/tensorflow-2.10.0-cp310-cp310-linux_x86_64.whl 
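
The wheel filename encodes the package version, the Python and ABI tags, and the platform, so the exact name depends on the checkout and the interpreter used for the build. As a minimal sketch (the helper names `parse_wheel_filename` and `matches_interpreter` are hypothetical, and only the plain name-version-python-abi-platform layout with an optional build tag is handled), the tags can be checked before running pip:

```python
import sys


def parse_wheel_filename(filename):
    """Split a wheel filename into its dash-separated components."""
    if not filename.endswith(".whl"):
        raise ValueError(f"not a wheel file: {filename}")
    parts = filename[: -len(".whl")].split("-")
    # Layout: name-version[-build]-python_tag-abi_tag-platform_tag
    if len(parts) not in (5, 6):
        raise ValueError(f"unexpected wheel filename: {filename}")
    python_tag, abi_tag, platform_tag = parts[-3:]
    return {
        "name": parts[0],
        "version": parts[1],
        "python": python_tag,
        "abi": abi_tag,
        "platform": platform_tag,
    }


def matches_interpreter(python_tag, version_info=sys.version_info):
    """Return True if a CPython tag such as 'cp310' matches the interpreter."""
    return python_tag == f"cp{version_info[0]}{version_info[1]}"


info = parse_wheel_filename("tensorflow-2.10.0-cp310-cp310-linux_x86_64.whl")
print(info)
print("matches this interpreter:", matches_interpreter(info["python"]))
```

If the second line prints False, the wheel was most likely built against a different Python than the one currently active.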

Verify

python
Python 3.10.0 (default, Nov 10 2021, 19:16:14) [GCC 7.5.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import tensorflow as tf
>>> tf.__version__
'2.10.0'
>>> 

Run the built-in tests

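The file tensorflow/core/framework/BUILD contains two kinds of test definitions: tf_cc_test and tf_cc_tests. The former defines a single test, while the latter defines several tests at once. Tests defined with tf_cc_tests cannot be run through the rule's name; each one is run through the name of its source file instead.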


tf_cc_tests(
    name = "higher_level_tests", # do not pass this name to bazel test
    size = "small",
    srcs = [
        "allocator_test.cc",
        "tensor_test.cc", # use this name instead, without the .cc suffix
        
    ],
    linkopts = select({
        "//tensorflow:macos": ["-headerpad_max_install_names"],
        "//conditions:default": [],
    }),
    linkstatic = tf_kernel_tests_linkstatic(),
    visibility = [
        "//tensorflow:internal",
        "//tensorflow/core:__pkg__",
    ],
    deps = [
       
    ],
)
bazel test --test_output all -c opt --test_filter=TensorTest.Default  //tensorflow/core/framework:tensor_test

# If the source file sits in a subdirectory of framework, e.g. test/tensor_test.cc, the target name is test_tensor_test: each / is replaced with _.

# Output
INFO: Found 1 test target...
INFO: From Testing //tensorflow/core/framework:tensor_test:
==================== Test output for //tensorflow/core/framework:tensor_test:
Note: Google Test filter = TensorTest.Default
[==========] Running 1 test from 1 test suite.
[----------] Global test environment set-up.
[----------] 1 test from TensorTest
[ RUN      ] TensorTest.Default
[       OK ] TensorTest.Default (0 ms)
[----------] 1 test from TensorTest (0 ms total)

[----------] Global test environment tear-down
[==========] 1 test from 1 test suite ran. (0 ms total)
[  PASSED  ] 1 test.
================================================================================
Target //tensorflow/core/framework:tensor_test up-to-date:
  bazel-bin/tensorflow/core/framework/tensor_test
INFO: Elapsed time: 86.390s, Critical Path: 42.22s
INFO: 872 processes: 305 internal, 567 local.
INFO: Build completed successfully, 872 total actions
//tensorflow/core/framework:tensor_test                                  PASSED in 1.0s
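
The naming rule described above for tf_cc_tests (drop the .cc suffix, turn each / into _) can be sketched as a small helper. The function name `tf_cc_tests_target` is hypothetical, and the helper only restates the rule from these notes rather than reading anything from Bazel:

```python
from pathlib import PurePosixPath


def tf_cc_tests_target(package, src):
    """Derive the Bazel label for one source file listed in a tf_cc_tests rule.

    Follows the rule from the notes: strip the .cc suffix and replace
    every '/' in the relative source path with '_'.
    """
    stem = str(PurePosixPath(src).with_suffix(""))
    return f"//{package}:{stem.replace('/', '_')}"


print(tf_cc_tests_target("tensorflow/core/framework", "tensor_test.cc"))
print(tf_cc_tests_target("tensorflow/core/framework", "test/tensor_test.cc"))
```

The first call yields the label used in the bazel test command above; the second shows the subdirectory case.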

Custom op extension: zero_out

See the official documentation:

Create an op | TensorFlow Core

The following steps must be carried out inside the TensorFlow source tree.

tree tensorflow/learn/
tensorflow/learn/
|-- BUILD
`-- op
    `-- zero_out
        |-- test.sh
        |-- zero_out.cc
        `-- zero_out_test.py

zero_out.cc

#include "tensorflow/core/framework/op.h"
#include "tensorflow/core/framework/shape_inference.h"
#include "tensorflow/core/framework/op_kernel.h"
using namespace tensorflow;

REGISTER_OP("ZeroOut")
    .Input("to_zero: int32")
    .Output("zeroed: int32")
    .SetShapeFn([](::tensorflow::shape_inference::InferenceContext* c) {
      c->set_output(0, c->input(0));
      return Status::OK();
    });

class ZeroOutOp : public OpKernel {
 public:
  explicit ZeroOutOp(OpKernelConstruction* context) : OpKernel(context) {}

  void Compute(OpKernelContext* context) override {
    // Grab the input tensor
    const Tensor& input_tensor = context->input(0);
    auto input = input_tensor.flat<int32>();

    // Create an output tensor
    Tensor* output_tensor = NULL;
    OP_REQUIRES_OK(context, context->allocate_output(0, input_tensor.shape(),
                                                     &output_tensor));
    auto output_flat = output_tensor->flat<int32>();

    // Set all but the first element of the output tensor to 0.
    const int N = input.size();
    for (int i = 1; i < N; i++) {
      output_flat(i) = 0;
    }

    // Preserve the first input value if possible.
    if (N > 0) output_flat(0) = input(0);
  }
};

REGISTER_KERNEL_BUILDER(Name("ZeroOut").Device(DEVICE_CPU), ZeroOutOp);
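
To make the kernel's behaviour easy to check by hand, here is a pure-Python model of what Compute does on the flattened input: the first element is kept and every other element becomes 0. The name `zero_out_reference` is hypothetical, and the model works on a flat sequence of integers only; it does not preserve tensor shape the way the real op does:

```python
def zero_out_reference(values):
    """Model of the ZeroOut kernel on a flat sequence of integers."""
    flat = [int(v) for v in values]
    if not flat:
        # Mirrors the N > 0 guard in the kernel: nothing to preserve.
        return []
    # Keep element 0, zero out the remaining N - 1 elements.
    return [flat[0]] + [0] * (len(flat) - 1)


print(zero_out_reference([5, 4, 3, 2, 1]))
```

A model like this can serve as the expected value in a Python test of the compiled op.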

zero_out_test.py

import tensorflow as tf

class ZeroOutTest(tf.test.TestCase):
  def testZeroOut(self):
    zero_out_module = tf.load_op_library('./zero_out.so')
    with self.test_session():
      result = zero_out_module.zero_out([5, 4, 3, 2, 1])
      self.assertAllEqual(result.numpy(), [5, 0, 0, 0, 0])
      print(result)

if __name__ == "__main__":
  tf.test.main()

test.sh

#!/bin/bash
src_dir=tensorflow/learn/op/zero_out
bazel build --config opt //tensorflow/learn:zero_out.so
cp -rf  bazel-bin/tensorflow/learn/zero_out.so ${src_dir}
cd ${src_dir}
python zero_out_test.py
rm -rf zero_out.so

Run the test

Run the script from the root of the TensorFlow source tree:

./tensorflow/learn/op/zero_out/test.sh 
INFO: Options provided by the client:
  Inherited 'common' options: --isatty=1 --terminal_columns=192
INFO: Reading rc options for 'build' from /data0/huozai/tensorflow/.bazelrc:
  Inherited 'common' options: --experimental_repo_remote_exec
INFO: Reading rc options for 'build' from /data0/huozai/tensorflow/.bazelrc:
  'build' options: --define framework_shared_object=true --define=use_fast_cpp_protos=true --define=allow_oversize_protos=true --spawn_strategy=standalone -c opt --announce_rc --define=grpc_no_ares=true --noincompatible_remove_legacy_whole_archive --enable_platform_specific_config --define=with_xla_support=true --config=short_logs --config=v2 --define=no_aws_support=true --define=no_hdfs_support=true --experimental_cc_shared_library --experimental_link_static_libraries_once=true
INFO: Reading rc options for 'build' from /data0/huozai/tensorflow/.tf_configure.bazelrc:
  'build' options: --action_env PYTHON_BIN_PATH=/data0/huozai/miniconda2/envs/python3.10/bin/python3 --action_env PYTHON_LIB_PATH=. --python_path=/data0/huozai/miniconda2/envs/python3.10/bin/python3 --action_env PYTHONPATH=:.:.:.
INFO: Reading rc options for 'build' from /data0/huozai/tensorflow/.bazelrc:
  'build' options: --deleted_packages=tensorflow/compiler/mlir/tfrt,tensorflow/compiler/mlir/tfrt/benchmarks,tensorflow/compiler/mlir/tfrt/jit/python_binding,tensorflow/compiler/mlir/tfrt/jit/transforms,tensorflow/compiler/mlir/tfrt/python_tests,tensorflow/compiler/mlir/tfrt/tests,tensorflow/compiler/mlir/tfrt/tests/ir,tensorflow/compiler/mlir/tfrt/tests/analysis,tensorflow/compiler/mlir/tfrt/tests/jit,tensorflow/compiler/mlir/tfrt/tests/lhlo_to_tfrt,tensorflow/compiler/mlir/tfrt/tests/tf_to_corert,tensorflow/compiler/mlir/tfrt/tests/tf_to_tfrt_data,tensorflow/compiler/mlir/tfrt/tests/saved_model,tensorflow/compiler/mlir/tfrt/transforms/lhlo_gpu_to_tfrt_gpu,tensorflow/core/runtime_fallback,tensorflow/core/runtime_fallback/conversion,tensorflow/core/runtime_fallback/kernel,tensorflow/core/runtime_fallback/opdefs,tensorflow/core/runtime_fallback/runtime,tensorflow/core/runtime_fallback/util,tensorflow/core/tfrt/common,tensorflow/core/tfrt/eager,tensorflow/core/tfrt/eager/backends/cpu,tensorflow/core/tfrt/eager/backends/gpu,tensorflow/core/tfrt/eager/core_runtime,tensorflow/core/tfrt/eager/cpp_tests/core_runtime,tensorflow/core/tfrt/gpu,tensorflow/core/tfrt/run_handler_thread_pool,tensorflow/core/tfrt/runtime,tensorflow/core/tfrt/saved_model,tensorflow/core/tfrt/graph_executor,tensorflow/core/tfrt/saved_model/tests,tensorflow/core/tfrt/tpu,tensorflow/core/tfrt/utils
INFO: Found applicable config definition build:short_logs in file /data0/huozai/tensorflow/.bazelrc: --output_filter=DONT_MATCH_ANYTHING
INFO: Found applicable config definition build:v2 in file /data0/huozai/tensorflow/.bazelrc: --define=tf_api_version=2 --action_env=TF2_BEHAVIOR=1
INFO: Found applicable config definition build:opt in file /data0/huozai/tensorflow/.tf_configure.bazelrc: --copt=-Wno-sign-compare --host_copt=-Wno-sign-compare
INFO: Found applicable config definition build:linux in file /data0/huozai/tensorflow/.bazelrc: --copt=-w --host_copt=-w --define=PREFIX=/usr --define=LIBDIR=$(PREFIX)/lib --define=INCLUDEDIR=$(PREFIX)/include --define=PROTOBUF_INCLUDE_PATH=$(PREFIX)/include --cxxopt=-std=c++14 --host_cxxopt=-std=c++14 --config=dynamic_kernels --distinct_host_configuration=false --experimental_guard_against_concurrent_changes
INFO: Found applicable config definition build:dynamic_kernels in file /data0/huozai/tensorflow/.bazelrc: --define=dynamic_loaded_kernels=true --copt=-DAUTOLOAD_DYNAMIC_KERNELS
INFO: Analyzed target //tensorflow/learn:zero_out.so (0 packages loaded, 0 targets configured).
INFO: Found 1 target...
Target //tensorflow/learn:zero_out.so up-to-date:
  bazel-bin/tensorflow/learn/zero_out.so
INFO: Elapsed time: 0.341s, Critical Path: 0.00s
INFO: 1 process: 1 internal.
INFO: Build completed successfully, 1 total action
Running tests under Python 3.10.0: /data0/huozai/miniconda2/envs/python3.10/bin/python
[ RUN      ] ZeroOutTest.testZeroOut
WARNING:tensorflow:From /data0/huozai/miniconda2/envs/python3.10/lib/python3.10/contextlib.py:103: TensorFlowTestCase.test_session (from tensorflow.python.framework.test_util) is deprecated and will be removed in a future version.
Instructions for updating:
Use `self.session()` or `self.cached_session()` instead.
W0430 12:56:33.761502 140095322453824 deprecation.py:350] From /data0/huozai/miniconda2/envs/python3.10/lib/python3.10/contextlib.py:103: TensorFlowTestCase.test_session (from tensorflow.python.framework.test_util) is deprecated and will be removed in a future version.
Instructions for updating:
Use `self.session()` or `self.cached_session()` instead.
2022-04-30 12:56:33.762514: I tensorflow/core/platform/cpu_feature_guard.cc:193] This TensorFlow binary is optimized with oneAPI Deep Neural Network Library (oneDNN) to use the following CPU instructions in performance-critical operations:  SSE3 SSE4.1 SSE4.2 AVX AVX2 AVX512F FMA
To enable them in other operations, rebuild TensorFlow with the appropriate compiler flags.
tf.Tensor([5 0 0 0 0], shape=(5,), dtype=int32)
INFO:tensorflow:time(__main__.ZeroOutTest.testZeroOut): 0.02s
I0430 12:56:33.782837 140095322453824 test_util.py:2458] time(__main__.ZeroOutTest.testZeroOut): 0.02s
[       OK ] ZeroOutTest.testZeroOut
[ RUN      ] ZeroOutTest.test_session
[  SKIPPED ] ZeroOutTest.test_session
----------------------------------------------------------------------
Ran 2 tests in 0.023s

OK (skipped=1)