V2EX › All replies by JasonLaw › Page 32 of 36

2020-08-08 14:22:12 +08:00

Replied to a topic created by JasonLaw › Databases › Is my understanding of "Designing Data-Intensive Applications - Detecting Concurrent Writes" correct?

@guyskk0x0 #12 If the database resolves the conflict, that certainly works, but as "Designing Data-Intensive Applications - CHAPTER 5 Replication - Leaderless Replication - Detecting Concurrent Writes - Last write wins (discarding concurrent writes)" puts it: "LWW achieves the goal of eventual convergence, but at the cost of durability: if there are several concurrent writes to the same key, even if they were all reported as successful to the client (because they were written to w replicas), only one of the writes will survive and the others will be silently discarded." What I have been stressing all along is the downside of letting the database resolve conflicts.
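
To make the quoted drawback concrete, here is a minimal sketch of last-write-wins resolution. The `Write` type, the timestamps, and the Android/iOS values are assumptions made up for illustration; they do not come from the book or from any real database.

```python
from typing import NamedTuple


class Write(NamedTuple):
    timestamp: float  # hypothetical timestamp attached to the write
    client: str
    value: str


def last_write_wins(writes):
    """Keep the write with the highest timestamp and drop every other one."""
    winner = max(writes, key=lambda w: w.timestamp)
    discarded = [w for w in writes if w is not winner]
    return winner, discarded


# Two concurrent writes to the same key, both already acknowledged to their clients.
writes = [
    Write(1.0, "android", "age=30"),
    Write(2.0, "ios", "age=31"),
]
winner, discarded = last_write_wins(writes)
print("kept:", winner.value)
print("silently discarded:", [w.value for w in discarded])
```

Both clients were told their write succeeded, yet only one value remains, and nothing tells the other client that its write is gone.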

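@guyskk0x0 #6 I don't think what you said earlier is consistent with what you said later. Earlier you said "if there is no merge, one client should fail" and "without merging, the data becomes inconsistent, so one of them must fail", but later you said "resolve the conflict in the client".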

2020-08-08 13:14:52 +08:00

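Replied to a topic created by JasonLaw › Databases › Is my understanding of "Designing Data-Intensive Applications - Detecting Concurrent Writes" correct?

@guyskk0x0 But a person understands how a conflict should be handled better than the database does, don't you think? If I concurrently change my age from Android and from iOS, how is the database supposed to pick a final value?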

2020-08-08 12:48:42 +08:00

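Replied to a topic created by JasonLaw › Databases › Is my understanding of "Designing Data-Intensive Applications - Detecting Concurrent Writes" correct?

@guyskk0x0 #8 Right, it's CRDT, conflict-free replicated data type. "Merging concurrently written values" discusses several approaches, and CRDTs are one of them. But how would a CRDT be used here?

"If there is no merge, one client should fail." Without merging, the data becomes inconsistent, so one of them must fail.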

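I disagree with "if there is no merge, one client should fail". In this topic the merge is left to the client. For example, at step 4, client 2 knows the cart has two values, [milk] and [eggs]. It wants to add ham, so it merges the two concurrently written values [milk] and [eggs], adds ham, and sends [eggs, milk, ham] to the database. There is no need to implement "Last write wins (discarding concurrent writes)" at all, and "Designing Data-Intensive Applications - CHAPTER 5 Replication - Leaderless Replication - Detecting Concurrent Writes - Last write wins (discarding concurrent writes)" also covers the problems with it.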
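
Step 4 can be sketched as follows, assuming the cart is treated as a set so that merging siblings is a plain union. The function name `merge_siblings` is hypothetical, and a union only works for carts that are added to; as the book notes, removals need extra bookkeeping such as tombstones.

```python
def merge_siblings(siblings):
    """Client-side merge: union of all concurrently written cart values."""
    merged = set()
    for cart in siblings:
        merged |= set(cart)
    return merged


# client 2 reads the key and gets both concurrently written values back.
siblings = [["milk"], ["eggs"]]

# It merges them, adds ham, and this is what it sends to the database.
new_cart = merge_siblings(siblings) | {"ham"}
print(sorted(new_cart))
```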

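I also disagree with "without merging, the data becomes inconsistent". Inconsistency means that the data differs across replicas; merging concurrently written values has little to do with it. Suppose there are two replicas, r1 and r2 (Multi-Leader Replication). client 1 executes set k 1 on r1 and client 2 concurrently executes set k 2 on r1, so r1 ends up keeping two values for k. Even if r1 never merges them, as long as r2 also reaches the state "k has two values, 1 and 2", the replicas are consistent.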
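
A toy model of that argument, as a sketch only: each replica keeps a set of sibling values per key, and replication copies siblings across without merging them. The `Replica` class and its methods are assumptions for illustration and ignore version tracking entirely.

```python
class Replica:
    def __init__(self):
        self.siblings = {}  # key -> set of concurrently written values

    def concurrent_set(self, key, value):
        """Record a write that is concurrent with the values already stored."""
        self.siblings.setdefault(key, set()).add(value)

    def sync_from(self, other):
        """Replicate: copy every sibling from the other replica, unmerged."""
        for key, values in other.siblings.items():
            self.siblings.setdefault(key, set()).update(values)


r1, r2 = Replica(), Replica()
r1.concurrent_set("k", 1)  # client 1: set k 1
r1.concurrent_set("k", 2)  # client 2, concurrently: set k 2
r2.sync_from(r1)

# Nobody merged anything, yet both replicas hold the same state.
print(r1.siblings == r2.siblings)
```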

Looking forward to your reply.

2020-08-08 12:12:18 +08:00

Replied to a topic created by JasonLaw › Databases › Is my understanding of "Designing Data-Intensive Applications - Detecting Concurrent Writes" correct?

@guyskk0x0 #6

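"Designing Data-Intensive Applications - CHAPTER 5 Replication - Leaderless Replication - Detecting Concurrent Writes - Merging concurrently written values" explains how to merge concurrently written values. You can read the details there, so I won't paste them here.

I still have a few questions:
1. What does "if there is a merge, it's a CRDT" mean?
2. Does "if there is no merge, one client should fail" mean "concurrent writes are not supported"? Doesn't that defeat the purpose of Multi-Leader Replication and Leaderless Replication?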

2020-08-07 22:55:35 +08:00

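Replied to a topic created by JasonLaw › Databases › A question about "Designing Data-Intensive Applications - CHAPTER 5 Replication - Leaderless Replication"

@limboMu #1 Could you explain the following two points in more detail? I don't quite follow what you mean.

1. "That way, within a single replica you know which value to use for the overwrite. As for replicas 3-5, they may simply not have received the write, or it may be due to latency; either way the cluster returns a single agreed value. This may be one of the reasons leaderless replication cannot guarantee strong consistency." (Please explain this whole passage, with an example if possible.)
2. "Also, some leaderless replication systems have an operation that goes back and corrects wrong values." (Could you explain what this correction of wrong values is?)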

2020-08-07 22:42:07 +08:00

Replied to a topic created by JasonLaw › Databases › Is my understanding of "Designing Data-Intensive Applications - Detecting Concurrent Writes" correct?

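@guyskk0x0 #3 What you describe is actually not quite what I asked. The "optimistic concurrency control via version numbers" you mention does not support concurrent writes. That is, it does not let "client 1 update table set value=101, version=2 where key= k and version=1" and "client 2 update table set value=102, version=2 where key= k and version=1" both take effect: only the client whose statement executes first succeeds, and the later one is ignored.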
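
The behaviour of those two statements can be reproduced with a small sketch. It uses an in-memory SQLite database through Python's `sqlite3` module purely for convenience; the table name `kv`, the column name `item_key`, and the helper `compare_and_set` are assumptions for this example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE kv (item_key TEXT PRIMARY KEY, value INTEGER, version INTEGER)"
)
conn.execute("INSERT INTO kv VALUES ('k', 100, 1)")


def compare_and_set(conn, key, new_value, expected_version):
    """Update only if the row still has the version the client read earlier."""
    cur = conn.execute(
        "UPDATE kv SET value = ?, version = ? WHERE item_key = ? AND version = ?",
        (new_value, expected_version + 1, key, expected_version),
    )
    return cur.rowcount == 1  # 0 rows matched means the version had moved on


# Both clients read version 1 and then try to write.
client1_ok = compare_and_set(conn, "k", 101, 1)
client2_ok = compare_and_set(conn, "k", 102, 1)
row = conn.execute("SELECT value, version FROM kv WHERE item_key = 'k'").fetchone()
print(client1_ok, client2_ok, row)
```

The second update matches no rows, so client 2's write has no effect unless the client notices and retries.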

Yet this kind of concurrent write is fundamental to Multi-Leader Replication and Leaderless Replication, and my question is precisely about how Leaderless Replication handles concurrent writes.

2020-08-07 07:28:56 +08:00

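Replied to a topic created by JasonLaw › Databases › Is my understanding of "Designing Data-Intensive Applications - Detecting Concurrent Writes" correct?

@ekoeko Strictly speaking, client 2 does not know that client 1 wrote concurrently with it. It only knows that multiple values were returned, so a concurrent write happened, but not which client it was concurrent with. Perhaps it is more accurate to say that the database knows which version is concurrent with which. What do you think?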
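
Here is a rough sketch of how a single replica could track this with one version number per key, following the algorithm described in "Detecting Concurrent Writes". The class name `CartServer` and its method are assumptions, and the sketch covers one key on one replica only.

```python
class CartServer:
    """One key on one replica; values are kept together with their version."""

    def __init__(self):
        self.version = 0
        self.siblings = {}  # version -> value written at that version

    def write(self, value, based_on=0):
        # Values at or below based_on were seen by the writer, so they are
        # overwritten. Values above it are concurrent and must be kept.
        self.siblings = {v: val for v, val in self.siblings.items() if v > based_on}
        self.version += 1
        self.siblings[self.version] = value
        return self.version, list(self.siblings.values())


server = CartServer()
v1, vals1 = server.write(["milk"])                      # client 1, never read
v2, vals2 = server.write(["eggs"])                      # client 2, never read
v3, vals3 = server.write(["milk", "flour"], based_on=v1)  # client 1 again
print(v3, vals3)
```

The client only ever sees a version number and a list of values. It is the server that compares versions and decides that [eggs] is concurrent with [milk, flour].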

2020-08-05 22:43:39 +08:00

Replied to a topic created by autoname › MySQL › Circular SQL master-slave replication: what to do when one of the 3 machines goes down

"Designing Data-Intensive Applications - CHAPTER 5 Replication - Multi-Leader Replication - Multi-Leader Replication Topologies" introduces three replication topologies: 1. circular topology 2. star topology 3. all-to-all topology.

It says "A problem with circular and star topologies is that if just one node fails, it can interrupt the flow of replication messages between other nodes, causing them to be unable to communicate until the node is fixed. The topology could be reconfigured to work around the failed node, but in most deployments such reconfiguration would have to be done manually. The fault tolerance of a more densely connected topology (such as all-to-all) is better because it allows messages to travel along different paths, avoiding a single point of failure." In other words, if you stay with a circular topology and one node fails, you can reconfigure the topology, or you can switch to the more fault-tolerant all-to-all topology.
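
The difference can be seen in a small reachability sketch. The three node names and the `reachable` helper are assumptions for illustration; each topology is modelled as a map from a node to the nodes it forwards writes to.

```python
from collections import deque


def reachable(edges, source, failed=frozenset()):
    """Nodes that a write originating at `source` reaches, skipping failed nodes."""
    seen = {source}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for nxt in edges.get(node, ()):
            if nxt not in seen and nxt not in failed:
                seen.add(nxt)
                queue.append(nxt)
    return seen


circular = {"A": ["B"], "B": ["C"], "C": ["A"]}
all_to_all = {"A": ["B", "C"], "B": ["A", "C"], "C": ["A", "B"]}

# Node C goes down. In the ring, B's writes can no longer reach A.
print(reachable(circular, "B", failed={"C"}))
# With all-to-all, B still reaches A directly.
print(reachable(all_to_all, "B", failed={"C"}))
```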

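Of course, each topology has its own pros and cons. For the details, have a look at the book "Designing Data-Intensive Applications".

By the way, what you have is not master-slave replication at all; it is Multi-Leader Replication.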

2020-08-05 11:20:24 +08:00

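Replied to a topic created by sdfqwe › MySQL › MySQL optimization question

@xsm1890 #9 To be honest, this statement is completely pointless.
@sdfqwe Did you deliberately construct it yourself?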

2020-08-05 11:12:30 +08:00

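Replied to a topic created by sdfqwe › MySQL › MySQL optimization question

@sdfqwe Why is your table defined like this?

1. nodeId is already the PRIMARY KEY, so why define a redundant index nodeid on it?
2. Why not just call the columns id and parentId? nodeId doesn't carry any business meaning, does it?
3. Why does a city table have a hierarchy?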

2020-08-05 10:14:30 +08:00

Replied to a topic created by sdfqwe › MySQL › MySQL optimization question

@sdfqwe #4 Then that fully explains it: both indexes, PRIMARY and nodeid, are on the nodeid column.

2020-08-04 22:30:46 +08:00

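Replied to a topic created by sdfqwe › MySQL › MySQL optimization question

But that doesn't explain why the third and fourth statements use the PRIMARY index. Could it be that both indexes, PRIMARY and nodeid, are on the nodeid column?

Please provide more information: the table structure and the data. Otherwise there's no way to tell the cause.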

2020-08-04 21:25:52 +08:00

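Replied to a topic created by sdfqwe › MySQL › MySQL optimization question

If so, that would explain it.

The query optimizer may rewrite select b.* from b_city b where b.nodeid in (select a.nodeid from b_city a) into select b.* from b_city b where exists(select 1 from b_city a where a.nodeid = b.nodeid), and then into select b.* from b_city b where exists(select 1 from b_city a where a.id = b.id). Executed that way it matches the execution plan shown, and it can be understood logically as "scan every row of b_city and, for each row, look up whether a b_city row with that row's id exists".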

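As for why it doesn't execute select a.nodeid from b_city a first: either way you have to examine every row of b_city b. Rather than checking whether each row's nodeid is in the set produced by select a.nodeid from b_city a, why not directly check whether b_city a contains a row whose nodeid equals that of the current b_city b row? If the nodeid index is a unique index and nodeid is not null, it can even do "scan every row of b_city b and check whether b_city a contains a row whose id equals the id of that b_city b row".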
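
As a sanity check that the IN form and the EXISTS form select the same rows, here is a small sketch. It runs on SQLite through Python's `sqlite3` module rather than on MySQL, so it says nothing about MySQL's actual plan; the `name` column and the sample rows are made up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE b_city (nodeid INTEGER PRIMARY KEY, name TEXT)")
conn.executemany(
    "INSERT INTO b_city VALUES (?, ?)",
    [(1, "city-1"), (2, "city-2"), (3, "city-3")],
)

in_rows = conn.execute(
    "SELECT b.* FROM b_city b "
    "WHERE b.nodeid IN (SELECT a.nodeid FROM b_city a) "
    "ORDER BY b.nodeid"
).fetchall()

exists_rows = conn.execute(
    "SELECT b.* FROM b_city b "
    "WHERE EXISTS (SELECT 1 FROM b_city a WHERE a.nodeid = b.nodeid) "
    "ORDER BY b.nodeid"
).fetchall()

print(in_rows == exists_rows)
```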

2020-08-04 20:21:43 +08:00

Replied to a topic created by sdfqwe › MySQL › MySQL optimization question

The nodeid index is a unique index and nodeid is not null. Is that right?

2020-08-04 10:05:08 +08:00

Replied to a topic created by zzhpeng › MySQL › Help, experts: slow SQL problem

@kimqcn #55 On "does IS NULL mean a full table scan": see this related discussion https://www.v2ex.com/t/694500 .

2020-08-04 09:21:57 +08:00

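Replied to a topic created by zzhpeng › MySQL › Help, experts: slow SQL problem

I've been reading "Designing Data-Intensive Applications" (《数据密集型应用系统设计》) lately. If it makes sense for your situation, perhaps you should separate OLTP from OLAP.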

2020-08-03 17:28:21 +08:00

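Replied to a topic created by zzhpeng › MySQL › Help, experts: slow SQL problem

@zzhpeng #26

My main point is that when the condition is is_virtual = 0 and most rows have is_virtual = 0, idx_storeid_isvirtual is not really of any use.

Did you run that statement directly in the same environment? If so, running it alone is certainly different from running it concurrently with other statements. I don't know the specifics, so I can't give you an answer.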

2020-08-03 17:16:48 +08:00

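Replied to a topic created by zzhpeng › MySQL › Help, experts: slow SQL problem

@GTim #18 You've got it wrong; it has to be analyzed case by case.

1. You said "without an aggregate function, select is executed after order by -> limit". The example in this very topic shows that claim is wrong.
2. You said "but with an aggregate function, select is executed before order by -> limit". Suppose the SQL is select sum(c1) from t group by c2 order by c2. If select sum(c1) were executed first, how could the subsequent order by c2 succeed?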
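
The query from point 2 can be tried out with a small sketch. It uses SQLite through Python's `sqlite3` module instead of MySQL, and the sample rows are made up. The output is sorted by c2 even though c2 is not in the select list, so the sort cannot be working only on the projected sum(c1) column.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (c1 INTEGER, c2 TEXT)")
conn.executemany(
    "INSERT INTO t VALUES (?, ?)",
    [(1, "b"), (2, "a"), (3, "b"), (4, "a")],
)

# Group "a" sums to 6 and group "b" sums to 4; ordering by c2 puts "a" first.
result = conn.execute(
    "SELECT sum(c1) FROM t GROUP BY c2 ORDER BY c2"
).fetchall()
print(result)
```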

2020-08-03 17:06:33 +08:00

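Replied to a topic created by zzhpeng › MySQL › Help, experts: slow SQL problem

@zzhpeng #21 So your index is basically useless and may even slow things down, because for every matching secondary index leaf entry, a lookup in the clustered index is performed as well.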

2020-08-03 16:49:02 +08:00

Replied to a topic created by zzhpeng › MySQL › Help, experts: slow SQL problem

If anyone is wondering why COUNT(*) still works correctly with LIMIT 1 added, see https://stackoverflow.com/questions/17020842/mysql-count-with-limit .
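
A quick way to see it, sketched with SQLite through Python's `sqlite3` module rather than MySQL (the table and its rows are made up): an aggregate without GROUP BY produces a single row, and LIMIT 1 is applied to that one-row result, not to the rows being counted.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (id INTEGER)")
conn.executemany("INSERT INTO t VALUES (?)", [(i,) for i in range(5)])

# LIMIT trims the result set, which already has exactly one row.
count_with_limit = conn.execute("SELECT COUNT(*) FROM t LIMIT 1").fetchone()[0]
count_plain = conn.execute("SELECT COUNT(*) FROM t").fetchone()[0]
print(count_with_limit, count_plain)
```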
